Tutorial I: Retrieving a single article

In this tutorial the aim is to retrieve a single article from arXiv whose title or abstract contains the word ‘Game’.

First, let us import Arcas:

>>> import arcas

The APIs are implemented as classes. Here we create an instance of the arXiv API class:

>>> api = arcas.Arxiv()

We will now create the query that is sent to arXiv. The records argument is the number of records we are requesting:

>>> parameters = api.parameters_fix(title='Game', abstract='Game', records=1)
>>> url = api.create_url_search(parameters)
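To give a feel for what a search URL of this kind can look like, here is a minimal sketch that composes one by hand with the standard library. It assumes the public arXiv API endpoint and its field prefixes (ti: for title, abs: for abstract); build_search_url is a hypothetical helper, and the URL that Arcas itself produces may differ.

```python
from urllib.parse import urlencode

# Assumption: endpoint and field prefixes follow the public arXiv API query
# syntax; this is not taken from the Arcas source code.
ARXIV_ENDPOINT = "http://export.arxiv.org/api/query"


def build_search_url(title, abstract, records):
    """Hypothetical helper: compose an arXiv search URL by hand."""
    # Match the word in the title OR in the abstract.
    search_query = f"ti:{title} OR abs:{abstract}"
    # urlencode percent-encodes the colons and turns spaces into '+'.
    query_string = urlencode(
        {"search_query": search_query, "max_results": records}
    )
    return f"{ARXIV_ENDPOINT}?{query_string}"


example_url = build_search_url("Game", "Game", 1)
print(example_url)
```

The max_results parameter plays the role that records plays in the Arcas call above.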

The query URL is used to send a request to the API, and afterwards we parse the XML response that has been retrieved:

>>> request = api.make_request(url)
>>> root = api.get_root(request)
>>> raw_article = api.parse(root)
>>> article = api.to_dataframe(raw_article[0])
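The parsing step can be pictured with a small standalone sketch. arXiv returns its results as an Atom feed, so the example below parses a placeholder feed with xml.etree.ElementTree and turns each entry into a dictionary. The feed content is made up for illustration, parse_entries is a hypothetical helper, and the way Arcas parses the response internally may well differ.

```python
import xml.etree.ElementTree as ET

# Placeholder Atom feed written for illustration only; it is not a real
# arXiv response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://example.org/abs/0000.00000</id>
    <title>Placeholder title about a Game</title>
    <summary>Placeholder abstract.</summary>
    <author><name>A. Author</name></author>
  </entry>
</feed>
"""

# Atom elements live in this XML namespace.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def parse_entries(xml_text):
    """Hypothetical helper: turn each Atom <entry> into a plain dict."""
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        entries.append({
            "url": entry.findtext("atom:id", namespaces=ATOM),
            "title": entry.findtext("atom:title", namespaces=ATOM),
            "abstract": entry.findtext("atom:summary", namespaces=ATOM),
            "author": [
                author.findtext("atom:name", namespaces=ATOM)
                for author in entry.findall("atom:author", ATOM)
            ],
        })
    return entries


entries = parse_entries(SAMPLE_FEED)
print(entries[0]["title"])
```

A list of dictionaries like this one is a convenient intermediate form, since it can be handed to a tabular structure such as a data frame.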

Note that we are using the pandas library to store the results. The data frame contains the metadata of an article as it is recorded by arXiv. We can type the following to see the columns of the data frame:

>>> article.columns
Index(['url', 'key', 'unique_key', 'title', 'author', 'abstract', 'doi',
   'date', 'journal', 'provenance', 'primary_category', 'category',
   'score', 'open_access'],
  dtype='object')

We can also ask for the title:

>>> article.title.unique()
    array([ 'A New Approach to Solve a Class of Continuous-Time Nonlinear
    Quadratic Zero-Sum Game Using ADP'], dtype=object)

Note that you might get a different title than the one shown here. That is fine; it simply means that new articles have been added to the API’s database.

The structure of the results is discussed in depth in result set.