Releases · DocNow/twarc

15 Nov 16:12

edsu

v2.8.1

51b31cd

v2.8.1

v2.8.1 includes a small update to the twarc search --help message that links
to Twitter's Building Queries for Search Tweets to help users figure out what's
possible.

https://developer.twitter.com/en/docs/twitter-api/tweets/search/integrate/build-a-query

23 Oct 20:46

edsu

v2.8.0

a5209e6

v2.8.0

v2.8.0 adds new controls for shaping the data that is returned from the Twitter API. By default twarc retrieves the fullest representation of a tweet by requesting all tweet, user, media, place and poll fields as well as all available expansions. This is generally good practice because downstream processing of the collected data can rely on having all of this data at its disposal. However, there may be cases where you want to customize the data that comes back. This is not recommended practice, but it could be useful in some contexts.

The options below let you fine-tune the types of data requested by these sub-commands: search, searches, tweet, sample, hydrate, users, mentions, timeline, timelines, conversation, conversations, and stream. The options include:

  --expansions TEXT               Comma separated list of expansions to
                                  retrieve. Default is all available.
  --tweet-fields TEXT             Comma separated list of tweet fields to
                                  retrieve. Default is all available.
  --user-fields TEXT              Comma separated list of user fields to
                                  retrieve. Default is all available.
  --media-fields TEXT             Comma separated list of media fields to
                                  retrieve. Default is all available.
  --place-fields TEXT             Comma separated list of place fields to
                                  retrieve. Default is all available.
  --poll-fields TEXT              Comma separated list of poll fields to
                                  retrieve. Default is all available.

These correspond to the API Fields and Expansions.
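
As a minimal sketch of how these options combine, the Python below assembles (but does not run) a twarc2 search command line that requests a reduced set of fields. The helper name build_search_command, the query and the output filename are hypothetical and exist only for illustration. The field names (id, text, created_at, username) and the author_id expansion are taken from Twitter's v2 API. Actually executing the command, for example with subprocess.run, assumes twarc2 is installed and configured.

```python
import shlex


def build_search_command(query, outfile, tweet_fields=None,
                         user_fields=None, expansions=None):
    """Assemble an argv list for `twarc2 search` (hypothetical helper)."""
    argv = ["twarc2", "search"]
    # Each option takes a single comma-separated value, per the help text.
    for flag, values in (
        ("--expansions", expansions),
        ("--tweet-fields", tweet_fields),
        ("--user-fields", user_fields),
    ):
        if values:
            argv += [flag, ",".join(values)]
    argv += [query, outfile]
    return argv


argv = build_search_command(
    "blacklivesmatter",          # illustrative query
    "tweets.jsonl",              # illustrative output file
    tweet_fields=["id", "text", "created_at"],
    user_fields=["id", "username"],
    expansions=["author_id"],
)
# Print a shell-quoted version of the command for copy and paste.
print(shlex.join(argv))
```

Any option left unset is simply omitted, so twarc falls back to its default of requesting everything for that field type.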

There is also --minimal-fields, which requests just a minimal subset of data, and --no-context-annotations, which omits context annotations so that more tweets can be fetched per request (500 instead of 100). Both apply to the same sub-commands: search, searches, tweet, sample, hydrate, users, mentions, timeline, timelines, conversation, conversations, and stream.

  --minimal-fields                By default twarc gets all available data.
                                  This option requests the minimal retrievable
                                  amount of data - only IDs and object
                                  references are retrieved. Setting this makes
                                  --max-results 500 the default. NOTE: This
                                  argument is mutually exclusive with
                                  arguments: [--counts-only, --poll-fields,
                                  --media-fields, --expansions, --no-context-
                                  annotations, --place-fields, --user-fields,
                                  --tweet-fields].
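
The mutual-exclusion rule and the larger page size described above can be sketched in a few lines of Python. This is an illustration of the rules as stated in these notes, not twarc's own implementation: the helper names conflicting_options and default_max_results are hypothetical, and the values 500 and 100 are simply the ones quoted above.

```python
# Options the help text lists as mutually exclusive with --minimal-fields.
EXCLUSIVE_WITH_MINIMAL = {
    "--counts-only", "--poll-fields", "--media-fields", "--expansions",
    "--no-context-annotations", "--place-fields", "--user-fields",
    "--tweet-fields",
}


def conflicting_options(options):
    """Return the chosen options that clash with --minimal-fields, sorted.

    Hypothetical helper; returns an empty list when there is no conflict.
    """
    chosen = set(options)
    if "--minimal-fields" not in chosen:
        return []
    return sorted(chosen & EXCLUSIVE_WITH_MINIMAL)


def default_max_results(options):
    """Page size implied by the notes: 500 when context annotations
    are not requested, otherwise 100 (hypothetical helper)."""
    chosen = set(options)
    if "--minimal-fields" in chosen or "--no-context-annotations" in chosen:
        return 500
    return 100


print(conflicting_options(["--minimal-fields", "--tweet-fields"]))
print(default_max_results(["--no-context-annotations"]))
```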

10 Oct 04:02

edsu

v2.7.3

2d2d210

v2.7.3

A bugfix release to apply black formatting rules.

10 Oct 02:19

edsu

v2.7.2

c417be6

v2.7.2

A bugfix release to apply black formatting rules.

10 Oct 00:40

edsu

v2.7.1

c417be6

v2.7.1

Adds the start-time/since-id parameters from the timeline CLI command to the timelines CLI command.
Ensures that the sample command only writes JSON to stdout.

04 Oct 13:36

edsu

v2.7.0

eb1b769

v2.7.0

v2.7.0 adds a new places command to search for places and their identifiers, which can be used in search and stream queries. Although it is still part of the v1.1 API, the 1.1/geo/search.json endpoint makes these place identifiers available when searching by name, geo coordinates, or IP address.

Usage: twarc2 places [OPTIONS] VALUE [OUTFILE]

  Search for places by place name, geo coordinates or ip address.

Options:
  --type [name|geo|ip]            How to search for places (defaults to name)
  --granularity [neighborhood|city|admin|country]
                                  What type of places to search for (defaults
                                  to neighborhood)
  --max-results INTEGER           Maximum results to return
  --json                          Output raw JSON response
  --help                          Show this message and exit.

There is a corresponding twarc.client2.Twarc2.geo() method which you
can use to do the lookup yourself from Python.
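
For scripted use of the CLI, a small wrapper can validate the choices shown in the help text before building the command. The sketch below only constructs the argument list. The helper name build_places_command is hypothetical, and the place name "Washington" is just an example value. It deliberately does not call Twarc2.geo(), whose signature is not given in these notes.

```python
# Allowed values copied from the `twarc2 places --help` output above.
PLACE_TYPES = ("name", "geo", "ip")
GRANULARITIES = ("neighborhood", "city", "admin", "country")


def build_places_command(value, type_="name", granularity="neighborhood",
                         max_results=None):
    """Assemble an argv list for `twarc2 places` (hypothetical helper)."""
    if type_ not in PLACE_TYPES:
        raise ValueError(f"--type must be one of {PLACE_TYPES}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"--granularity must be one of {GRANULARITIES}")
    argv = ["twarc2", "places", "--type", type_, "--granularity", granularity]
    if max_results is not None:
        argv += ["--max-results", str(max_results)]
    argv.append(value)
    return argv


# Example: look up city-level places matching a name.
places_argv = build_places_command("Washington", granularity="city",
                                   max_results=5)
print(places_argv)
```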

27 Sep 23:49

SamHames

v2.6.0

decd541

v2.6.0

Adds the searches CLI command for running multiple searches from an input file.
Makes progress reporting more accurate for commands that consume files one line at a time (users, conversations, hydrate, etc.).

22 Sep 13:00

edsu

v2.5.0

3e40ba2

v2.5.0

This release includes new functionality for working with Twitter's new Batch Compliance API, which allows you to upload large datasets of Tweet or user IDs and retrieve their compliance status, so you can determine what data requires action to bring your datasets into compliance.

Usage: twarc2 compliance-job [OPTIONS] COMMAND [ARGS]...

  Create, retrieve and list batch compliance jobs for Tweets and Users.

Options:
  --help  Show this message and exit.

Commands:
  create    Create a new compliance job and upload tweet IDs.
  download  Download the compliance job with the specified ID.
  get       Returns status and download information about the job ID.
  list      Returns a list of compliance jobs by job type and status.

18 Aug 08:51

edsu

v2.4.3

f926531

v2.4.3

new release

17 Aug 09:46

edsu

v2.4.2

4360dda

v2.4.2

This release ensures that the timeline, timelines, conversation and conversations commands default to a --start-time of 2006-03-21 (the first day of tweets) when instructed to use the /tweets/search/all endpoint behind the scenes. For example, when doing:

twarc2 timeline --use-search jack

or:

twarc2 conversation --archive 21

Previously it was defaulting to the last 30 days (which is an unfortunate default set by the /tweets/search/all endpoint). Many thanks to Darren Halpin and @SamHames for identifying and fixing the issue!
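
The defaulting behavior described above can be sketched as follows. This is an illustration of the rule, not the code from the fix: the function name resolve_start_time and its parameters are hypothetical, and treating the date as midnight UTC is an assumption.

```python
from datetime import datetime, timezone

# First day of tweets, as stated in the release notes (UTC is an assumption).
FIRST_TWEET_DAY = datetime(2006, 3, 21, tzinfo=timezone.utc)


def resolve_start_time(start_time=None, use_full_archive=False):
    """Pick the start time to send to the API (hypothetical helper).

    When the full-archive search endpoint is used and the caller gave no
    start time, fall back to the first day of tweets instead of letting
    the endpoint apply its own 30-day default.
    """
    if start_time is None and use_full_archive:
        return FIRST_TWEET_DAY
    return start_time


print(resolve_start_time(use_full_archive=True).isoformat())
```

An explicit start time always wins, so users who pass --start-time see no change in behavior.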

Contributors

SamHames
