
Quick Start

robertkearns edited this page Jul 14, 2019 · 11 revisions


These examples use a small JSON response from the Twitter API, located here

import Aphid
# twitter_json is the parsed Twitter API response referenced above
nested = twitter_json

Aphid.search(iterable, searchfor, searchtype='key', **kwargs):

search finds the first match and returns it as a tuple (key, value). To get the first result for the original quoted status:

result = Aphid.search(twitter_json, 'quoted_status')
>>>result
('quoted_status', {'text': 'original message', 'user': {'screen_name': 'OriginalTweeter'}, 'place': {}, 'entities': {}, 'extended_entities': {}})
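
To make the (key, value) return shape concrete, here is a minimal pure-Python sketch of a first-match key search over nested dicts and lists. It is not Aphid's implementation: `first_match` and the sample `tweet` dict are hypothetical, and Aphid's actual traversal order may differ.

```python
def first_match(data, key):
    """Return the first (key, value) pair whose key equals `key`, or None.

    Hypothetical helper for illustration only; not part of Aphid.
    """
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                return (k, v)
            # Not a match at this level, so look inside the value.
            found = first_match(v, key)
            if found is not None:
                return found
    elif isinstance(data, (list, tuple)):
        for item in data:
            found = first_match(item, key)
            if found is not None:
                return found
    return None


# Hypothetical, trimmed-down sample in the shape of the Twitter response.
tweet = {
    "text": "My added comments",
    "user": {"screen_name": "TweetQuoter"},
    "quoted_status": {
        "text": "original message",
        "user": {"screen_name": "OriginalTweeter"},
    },
}

result = first_match(tweet, "quoted_status")
print(result[0])          # quoted_status
print(result[1]["text"])  # original message
```

Because the result is a plain tuple, it can be unpacked directly, for example `key, value = result`.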

Aphid.findall(iterable, searchfor, max_=0, searchtype='key', **kwargs):

findall finds all matches and returns them as a list of tuples: [(key, value), (key, value)]. max_ is an optional keyword that limits the number of results found; it defaults to 0. To find all text, title, and description blocks:

results = Aphid.findall(twitter_json, ['text', 'title', 'description'])
>>>results
[('text', 'My added comments to this Tweet ---> https:\\/\\/t.co\\/LinkToTweet'), ('text', 'original message'), ('title', 'RT with comment, aka Quote Tweets, now may have media attached across Twitter APIs'),
 ('description', 'Today were announcing a new feature that allows userspeople on Twitter to add a GIF, video, or photo to their Retweet with comments, aka Quote Tweets. Developers will start to see additional media metadata included in the payload for Quote Tweets.
 Quote Tweets with media will be rendered very similarly to Tweets with media across our APIs. This rendering is an addition to the existing media field, so this should not be a breaking change for application owners, if they are already ingesting med...')]
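
Since the result is an ordinary list of (key, value) tuples, it can be post-processed with standard Python. The sketch below groups values by key using `collections.defaultdict`; the `results` list is shortened, hypothetical sample data in the same shape as the output above, not a live call to Aphid.

```python
from collections import defaultdict

# Sample data in the shape findall returns: a list of (key, value) tuples.
results = [
    ("text", "My added comments to this Tweet"),
    ("text", "original message"),
    ("title", "RT with comment, aka Quote Tweets"),
]

# Collect every value found under the same key into one list.
grouped = defaultdict(list)
for key, value in results:
    grouped[key].append(value)

print(grouped["text"])   # ['My added comments to this Tweet', 'original message']
print(grouped["title"])  # ['RT with comment, aka Quote Tweets']
```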

Aphid.sub(iterable, searchfor, new, max_=0, searchtype='key', **kwargs):

*Note: sub modifies the original iterable in place and returns None.
**Note: The real power of sub comes from overriding the on_match method in a subclass.

The sub function walks the structure and replaces values based on a specified key or value. new is the value that replaces the old one. As a trivial example, let's go back to the first structure and replace the value of every text key with 'this is a modified value':

Aphid.sub(twitter_json, 'text', 'this is a modified value')
>>>twitter_json
{'text': 'this is a modified value', 'user': {'screen_name': 'TweetQuoter'}, 'quoted_status': {'text': 'this is a modified value', 'user': {'screen_name': 'OriginalTweeter'}........
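
Because the replacement happens in place and nothing is returned, the original data is lost unless it is copied first. The sketch below shows one way to keep a backup with `copy.deepcopy`. `replace_values` is a hypothetical stand-in written for this example, not Aphid's implementation, and `tweet` is made-up sample data.

```python
import copy


def replace_values(data, key, new):
    """Replace the value of every `key` in place; returns None.

    Hypothetical stand-in for illustration; not Aphid's implementation.
    """
    if isinstance(data, dict):
        for k in data:
            if k == key:
                data[k] = new  # overwrite the matched value
            else:
                replace_values(data[k], key, new)
    elif isinstance(data, list):
        for item in data:
            replace_values(item, key, new)


tweet = {"text": "comment", "quoted_status": {"text": "original message"}}

backup = copy.deepcopy(tweet)  # independent copy, taken before mutation
outcome = replace_values(tweet, "text", "this is a modified value")

print(outcome)                          # None
print(tweet["quoted_status"]["text"])   # this is a modified value
print(backup["quoted_status"]["text"])  # original message
```

A shallow copy (`dict(tweet)` or `tweet.copy()`) would not be enough here, since the nested dicts would still be shared with the mutated structure.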

Aphid.find_paths(iterable, searchfor, max_=0, searchtype='key', **kwargs):

Use find_paths to find the path to a nested key. The function returns a list of results in the format [[0, 'key', 'key', 3], [1, 'key']], where each inner list is a separate match and each component is either a key or an index (depending on the data type). To find the path to the first (and only) title key, we will pretend that multiple title keys exist and use the max_ keyword to stop after one match:

results = Aphid.find_paths(twitter_json, 'title', max_=1)
>>>results
[['entities', 'urls', 0, 'unwound', 'title']]
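
A returned path can be followed with the standard library to reach the value it points to. This is a minimal sketch, assuming each path mixes dict keys and list indices and ends with the matching key, as in the output above. `follow` is a hypothetical helper and `tweet` is made-up sample data shaped to fit that path.

```python
from functools import reduce
import operator


def follow(data, path):
    """Walk `path` (a list of dict keys and list indices) starting from `data`."""
    # reduce applies data[step] for each step in turn.
    return reduce(operator.getitem, path, data)


# Hypothetical sample shaped like the path shown in the output above.
tweet = {"entities": {"urls": [{"unwound": {"title": "RT with comment"}}]}}
path = ["entities", "urls", 0, "unwound", "title"]

print(follow(tweet, path))  # RT with comment
```

Dropping the last component, `follow(tweet, path[:-1])`, returns the container that holds the key, which is useful when the value needs to be changed rather than read.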

Aphid.find_attribute_path(cls, searchfor, max_=0, searchtype='key', **kwargs):

find_attribute_path works just like find_paths, except that it works with nested classes. Suppose we want the date attribute inside a requests response, which is nested within multiple classes:

import requests

resp = requests.get('https://github.com/robertkearns/Aphid')
results = Aphid.find_attribute_path(resp, 'date')
>>>results
[['headers', '_store', 'date'], ['raw', 'headers', '_container', 'date']]

Now we can see where it resides. Following the first path by hand:

>>>resp.headers._store['date']
('Date', 'Sat, 13 Jul 2019 12:54:55 GMT')
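
Following such a path programmatically means mixing attribute access and item access, since the first steps are attributes and the last is a dict key. The sketch below is one way to do that, under stated assumptions: `follow_attributes` is a hypothetical helper, and `resp` is a `SimpleNamespace` stand-in shaped like the first path above rather than a real requests response, so the example runs without network access.

```python
from types import SimpleNamespace


def follow_attributes(obj, path):
    """Follow `path`, using each step as an attribute if present, else as a key/index."""
    for step in path:
        if isinstance(step, str) and hasattr(obj, step):
            obj = getattr(obj, step)
        else:
            obj = obj[step]
    return obj


# Stand-in for a response object; shaped like the first path in the output above.
resp = SimpleNamespace(
    headers=SimpleNamespace(_store={"date": ("Date", "Sat, 13 Jul 2019 12:54:55 GMT")})
)

print(follow_attributes(resp, ["headers", "_store", "date"]))
# ('Date', 'Sat, 13 Jul 2019 12:54:55 GMT')
```

One caveat of this attribute-first approach: a dict key that happens to share a name with a dict method (such as `items`) would be resolved as the method, so real code may need to know which steps are attributes and which are keys.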