import csv
from collections import defaultdict

# a dictionary whose values default to an empty list
data = defaultdict(list)

# open the csv file and iterate over its rows. the enumerate()
# function gives us an incrementing row number
for i, row in enumerate(csv.reader(open('python_test.csv', 'rb'))):
    # skip the header line and any empty rows
    # we take advantage of the first row being indexed at 0:
    # i=0 evaluates as false, as does an empty row
    if not i or not row:
        continue
    # unpack the columns into local variables
    _, zipcode, level = row
    # for each zipcode, add the level to the list
    data[zipcode].append(float(level))

# loop over each zipcode and its list of levels and calculate the average
for zipcode, levels in data.iteritems():
    print zipcode, sum(levels) / float(len(levels))
[python_test.csv]
ID,ZIPCODE,RATE
1,19003,27.5
2,19003,31.33
3,19083,41.4
4,19083,17.9
5,19102,21.4
Running the source code on the input data above produces the following result (it works correctly on Python 2.5 and later 2.x releases).
19003 29.415
19083 29.65
19102 21.4
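The script above relies on the `print` statement and `iteritems()`, which exist only in Python 2. As a rough sketch of how the same grouping could look on Python 3, the version below keeps the `defaultdict(list)` approach. The function name `average_by_zipcode` and the inline `SAMPLE` string are illustrative, not part of the original post, and the sample assumes the file is comma-separated.

```python
import csv
import io
from collections import defaultdict


def average_by_zipcode(lines):
    """Group the RATE column by ZIPCODE and return {zipcode: average}."""
    levels_by_zip = defaultdict(list)
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        if not row:  # csv.reader yields [] for blank lines
            continue
        _, zipcode, level = row
        levels_by_zip[zipcode].append(float(level))
    return {z: sum(v) / len(v) for z, v in levels_by_zip.items()}


# Inline copy of the sample data (assumed to be comma-separated).
SAMPLE = """ID,ZIPCODE,RATE
1,19003,27.5
2,19003,31.33
3,19083,41.4
4,19083,17.9
5,19102,21.4
"""

if __name__ == "__main__":
    averages = average_by_zipcode(io.StringIO(SAMPLE))
    for zipcode, average in sorted(averages.items()):
        print(zipcode, average)
```

To read a real file instead, pass `open('python_test.csv', newline='')` to the function. In Python 3 the `csv` module expects text mode, not `'rb'`.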
Advanced Source code
from collections import defaultdict

# dictionaries whose values default to an empty list
data = defaultdict(list)   # zipcode -> list of levels
data2 = defaultdict(list)  # zipcode -> list of info strings

# read the tab-separated text file and split each line into columns
f = open('python_test.txt', 'r')
fList = []
for line in f.readlines():
    fList.append(line.split('\t'))
f.close()

# iterate over the parsed rows. the enumerate()
# function gives us an incrementing row number
for i, row in enumerate(fList):
    # skip the header line (i=0 evaluates as false) and any blank line;
    # split() never returns an empty list, so check the column count instead
    if not i or len(row) < 4:
        continue
    # unpack the columns into local variables
    _, zipcode, level, info = row
    # for each zipcode, add the level and the info string to their lists
    data[zipcode].append(float(level))
    data2[zipcode].append(info.strip())

# loop over each zipcode and its list of levels, calculate the average,
# and print it together with the '|'-joined info strings
for zipcode, levels in data.iteritems():
    print zipcode, sum(levels) / float(len(levels)), '|'.join(data2[zipcode])
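The post does not show the contents of `python_test.txt`, so the following Python 3 sketch uses a made-up tab-separated sample to illustrate the same idea. It swaps the manual `line.split('\t')` for `csv.reader` with `delimiter="\t"`. The function name `summarize_by_zipcode`, the `INFO` column name, and the sample values are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_by_zipcode(lines):
    """Return {zipcode: (average level, '|'-joined info strings)}."""
    levels = defaultdict(list)
    infos = defaultdict(list)
    reader = csv.reader(lines, delimiter="\t")
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) != 4:  # skip blank or malformed lines
            continue
        _, zipcode, level, info = row
        levels[zipcode].append(float(level))
        infos[zipcode].append(info.strip())
    return {z: (sum(v) / len(v), "|".join(infos[z])) for z, v in levels.items()}


# Hypothetical tab-separated sample; the INFO values are invented.
SAMPLE = (
    "ID\tZIPCODE\tRATE\tINFO\n"
    "1\t19003\t27.5\talpha\n"
    "2\t19003\t31.33\tbeta\n"
    "\n"
    "3\t19102\t21.4\tgamma\n"
)

if __name__ == "__main__":
    summary = summarize_by_zipcode(io.StringIO(SAMPLE))
    for zipcode, (average, info) in sorted(summary.items()):
        print(zipcode, average, info)
```

Keeping the two `defaultdict(list)` containers mirrors the original script. A single dictionary of `(level, info)` tuples would also work if the two columns are always used together.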
Reference
<defaultdict>
http://docs.python.org/release/2.5.2/lib/defaultdict-examples.html
<sourcecode>
http://stackoverflow.com/questions/5328971/python-csv-need-to-group-and-calculate-values-based-on-one-key