Raman Balyan - 1 month ago
Python Question

Data processing using Python Pandas

I have a CSV file in the format mentioned below:

API Name               Test Result  Risk Rating  Vulnerability Category
https://api-test.com   FAIL         LOW          Information Gathering
https://api-test1.com  PASS         MEDIUM       Authentication Test
https://api-test2.com  SKIP         HIGH         Web Service
https://api-test1.com  FAIL         CRITICAL     Configuration Management


I am using the pandas library for data processing. As the table shows, an API URL can appear more than once. What I want is to collect the details of each API in its own DataFrame. For example, the variable for the API "https://api-test1.com" should contain data like this:

API Name               Test Result  Risk Rating  Vulnerability Category
https://api-test1.com  PASS         MEDIUM       Authentication Test
https://api-test1.com  FAIL         CRITICAL     Configuration Management


Similarly, the variable for API2 should contain all the rows related to API2. Thanks!

Answer

You can use the .duplicated(keep=False) method, which flags every row whose API Name occurs more than once:

In [138]: df['API Name'].duplicated(keep=False)
Out[138]:
0    False
1     True
2    False
3     True
Name: API Name, dtype: bool

In [139]: df[df['API Name'].duplicated(keep=False)]
Out[139]:
                API Name Test Result Risk Rating    Vulnerability Category
1  https://api-test1.com        PASS      MEDIUM       Authentication Test
3  https://api-test1.com        FAIL    CRITICAL  Configuration Management
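
For anyone who wants to reproduce the session above, here is a minimal sketch. It assumes the sample rows from the question are typed in by hand (a real script would load the file with pd.read_csv instead); the names mask and repeated are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; with a real file this would be
# something like pd.read_csv(<path to the CSV>).
df = pd.DataFrame({
    "API Name": [
        "https://api-test.com",
        "https://api-test1.com",
        "https://api-test2.com",
        "https://api-test1.com",
    ],
    "Test Result": ["FAIL", "PASS", "SKIP", "FAIL"],
    "Risk Rating": ["LOW", "MEDIUM", "HIGH", "CRITICAL"],
    "Vulnerability Category": [
        "Information Gathering",
        "Authentication Test",
        "Web Service",
        "Configuration Management",
    ],
})

# keep=False marks every occurrence of a repeated value,
# not only the second and later ones.
mask = df["API Name"].duplicated(keep=False)

# Boolean indexing keeps only the rows where the mask is True.
repeated = df[mask]
print(repeated)
```

Note that if several different APIs were repeated, their rows would all end up together in this one result, so it answers "which rows belong to a repeated API" rather than splitting the data per API.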

UPDATE: you don't need separate variables (api1, api2, etc.), as you can always select the rows for any API directly from the DataFrame:

In [152]: apis = df['API Name'].unique()

In [153]: apis
Out[153]: array(['https://api-test.com', 'https://api-test1.com', 'https://api-test2.com'], dtype=object)

In [154]: for api in apis:
     ...:     print(df.loc[df['API Name'] == api])
     ...:
               API Name Test Result Risk Rating Vulnerability Category
0  https://api-test.com        FAIL         LOW  Information Gathering
                API Name Test Result Risk Rating    Vulnerability Category
1  https://api-test1.com        PASS      MEDIUM       Authentication Test
3  https://api-test1.com        FAIL    CRITICAL  Configuration Management
                API Name Test Result Risk Rating Vulnerability Category
2  https://api-test2.com        SKIP        HIGH            Web Service
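
If per-API objects are still wanted, one possible alternative to the loop above is groupby, which splits the DataFrame by API Name in a single pass. This is only a sketch, not part of the original answer; the dictionary name frames is hypothetical and the sample rows are again typed in from the question.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "API Name": [
        "https://api-test.com",
        "https://api-test1.com",
        "https://api-test2.com",
        "https://api-test1.com",
    ],
    "Test Result": ["FAIL", "PASS", "SKIP", "FAIL"],
    "Risk Rating": ["LOW", "MEDIUM", "HIGH", "CRITICAL"],
    "Vulnerability Category": [
        "Information Gathering",
        "Authentication Test",
        "Web Service",
        "Configuration Management",
    ],
})

# Iterating over a groupby object yields (key, sub-DataFrame) pairs,
# so a dict comprehension gives one DataFrame per API URL.
frames = {api: group for api, group in df.groupby("API Name")}

# Look up the rows for a single API by its URL.
api1 = frames["https://api-test1.com"]
print(api1)
```

A dictionary keyed by URL avoids having to invent a variable name for every API and keeps working when new URLs show up in the file.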