fourth fourth - 17 days ago 4
Python Question

How to automatically create series numbers (a row index) with pd.read_csv()?

I got a CSV data file from GitHub and read it with pd.read_csv(). It automatically creates a series number (row index), like this:

   label  repeattrips        id  offer_id  never_bought_company  \
0      1            5     86246   1208251                     0
1      1           16     86252   1197502                     0
2      0            0  12682470   1197502                     1
3      0            0  12996040   1197502                     1
4      0            0  13089312   1204821                     0
5      0            0  13179265   1197502                     1
6      0            0  13251776   1200581                     0


But when I create my own CSV file and read it, I get this:

label gender age_range action0 action1 action2 action3 first \
0 0 2 1 0 1 0 2 1
0 0 4 0 0 1 0 1 1
0 1 2 8 0 1 0 9 1
1 0 2 0 0 1 0 1 1
0 1 5 0 0 1 0 1 1
0 1 5 0 0 1 0 1 1


In my output, the label column is treated as the series number.

Adding a series number at the front of every line of my data still didn't solve the problem. The output looks like this:

label gender age_range action0 action1 action2 action3 first \
0 0 0 2 1 0 1 0 2 1
1 0 0 4 0 0 1 0 1 1
2 0 1 2 8 0 1 0 9 1
3 1 0 2 0 0 1 0 1 1
4 0 1 5 0 0 1 0 1 1
5 0 1 5 0 0 1 0 1 1
6 0 0 7 5 0 1 0 6 1
7 0 0 7 1 0 1 0 2 1


I don't know if I saved the file correctly. My CSV data looks like this (with the series number added), and the GitHub file appears to have a similar format:

label gender age_range action0 action1 action2 action3 first second third fourth sirstrate secondrate thirdrate fourthrate total_cat total_brand total_time total_items users_appear users_items users_cats users_brands users_times users_action0 users_action1 users_action2 users_action3 merchants_appear merchants_items merchants_cats merchants_brands merchants_times merchants_action0 merchants_action1 merchants_action2 merchants_action3
0 0 0 2 1 0 1 0 2 1 1 0 0.0224719101124 0.5 0.5 0 1 1 1 1 89 71 22 45 17 87 0 2 0 46 34 11 16 3 38 4 2 2
1 0 0 4 0 0 1 0 1 1 1 0 0.00469483568075 0.0232558139535 0.0232558139535 0.0 1 1 1 1 213 102 47 44 30 170 0 36 7 103 58 25 23 6 81 0 22 0
2 0 1 2 8 0 1 0 9 1 1 0 0.0157342657343 0.0181818181818 0.0181818181818 0.0 2 2 1 5 572 393 111 158 60 517 0 15 40 119 70 24 20 17 106 6 7 0
3 1 0 2 0 0 1 0 1 1 1 0 0.0142857142857 0.0769230769231 0.0769230769231 0.0 1 1 1 1 70 33 19 15 15 57 0 11 2 27 17 11 15 11 18 0 2 7
4 0 1 5 0 0 1 0 1 1 1 0 0.025641025641 0.2 0.2 0.0 1 1 1 1 39 32 16 29 14 34 0 4 1 133 88 26 25 11 128 0 5 0


Each whole line ends up in a single cell, rather than each item of a line getting its own cell.

Could you tell me how to solve this?

Answer

You'll need to provide code to get more substantive help since it's unclear why you're facing a problem. For example, copying the data you pasted at the bottom reads in just fine with pd.read_clipboard(), and pd.read_csv() should also work fine as long as you set it up with a space separator:

In [2]: pd.read_clipboard()
Out[2]:
   label  gender  age_range  action0  action1  action2  action3  first  \
0      0       0          2        1        0        1        0      2
1      0       0          4        0        0        1        0      1
2      0       1          2        8        0        1        0      9
3      1       0          2        0        0        1        0      1
4      0       1          5        0        0        1        0      1

   second  third        ...          users_action3  merchants_appear  \
0       1      1        ...                      0                46
1       1      1        ...                      7               103
2       1      1        ...                     40               119
3       1      1        ...                      2                27
4       1      1        ...                      1               133

   merchants_items  merchants_cats  merchants_brands  merchants_times  \
0               34              11                16                3
1               58              25                23                6
2               70              24                20               17
3               17              11                15               11
4               88              26                25               11

   merchants_action0  merchants_action1  merchants_action2  merchants_action3
0                 38                  4                  2                  2
1                 81                  0                 22                  0
2                106                  6                  7                  0
3                 18                  0                  2                  7
4                128                  0                  5                  0

[5 rows x 37 columns]
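
For a file on disk, the space-separated read mentioned above might look like the sketch below. The sample strings and variable names are made up for illustration (they only mimic the layout of the pasted data), and the sketch assumes the file is delimited by whitespace rather than commas. It also shows two behaviours that may explain the symptoms in the question: when every data row has one more field than the header, pandas uses the first field as the row index, and when a space-separated file is read with the default comma separator, each line lands in a single column.

```python
import io

import pandas as pd

# Hypothetical sample: 4 header names, 5 fields per row
# (a series number was added in front of every data line).
raw_with_index = """label gender age_range action0
0 0 0 2 1
1 0 0 4 0
2 0 1 2 8
3 1 0 2 0
"""

# Hypothetical sample: 4 header names, 4 fields per row (no series number).
raw_plain = """label gender age_range action0
0 0 2 1
0 0 4 0
"""

# One extra field per row: pandas takes the first field as the row index.
df_with_index = pd.read_csv(io.StringIO(raw_with_index), sep=r"\s+")

# Same number of fields as header names: pandas generates 0, 1, 2, ... itself.
df_default = pd.read_csv(io.StringIO(raw_plain), sep=r"\s+")

# Default separator is a comma, so each whole line becomes one column.
df_comma = pd.read_csv(io.StringIO(raw_plain))

print(df_with_index)
print(df_default)
print(df_comma.shape)
```

With a real file, the `io.StringIO(...)` argument would be replaced by the file path. If the file actually uses commas, the `sep` argument can simply be dropped.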