aoptk.literature.databases.pmc
Classes
PMC: Class for retrieving and parsing open access PMC publications.
Module Contents
- class aoptk.literature.databases.pmc.PMC(query: str, storage: str, figure_storage: str)
Bases: aoptk.literature.get_publication.GetPublication, aoptk.literature.get_pdf.GetPDF, aoptk.literature.get_id.GetID
Class for retrieving and parsing open access PMC publications.
- get_pdfs() → list[aoptk.literature.pdf.PDF]
Retrieve PDFs based on the query.
- get_publications() → list[aoptk.literature.publication.Publication]
Get a list of publications.
- Returns:
A list of Publication objects.
- Return type:
list[aoptk.literature.publication.Publication]
- async get_ids() → list[aoptk.literature.id.ID]
Retrieve a list of publication IDs based on the query.
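Because get_ids() is declared async, calling it returns a coroutine that has to be awaited. The following is a minimal sketch of that calling pattern only. StubPMC is a hypothetical stand-in that copies the constructor signature so the snippet runs without network access or the aoptk package; it returns made-up strings, whereas the real method is documented to return aoptk.literature.id.ID objects.

```python
import asyncio


class StubPMC:
    """Hypothetical stand-in mirroring only the PMC constructor and get_ids() shape."""

    def __init__(self, query: str, storage: str, figure_storage: str) -> None:
        self.query = query
        self.storage = storage
        self.figure_storage = figure_storage

    async def get_ids(self) -> list[str]:
        # Placeholder for the real lookup; yields control once, as an I/O call would.
        await asyncio.sleep(0)
        return ["PMC0000001", "PMC0000002"]  # made-up IDs for illustration


async def collect_ids(client: StubPMC) -> list[str]:
    # Inside another coroutine, await the call directly.
    return await client.get_ids()


def collect_ids_sync(client: StubPMC) -> list[str]:
    # From synchronous code, drive the coroutine to completion with asyncio.run().
    return asyncio.run(collect_ids(client))


client = StubPMC("liver fibrosis", "pdfs/", "figures/")
ids = collect_ids_sync(client)
```

Note that asyncio.run() cannot be called from a thread that already has a running event loop (for example inside a notebook cell); in that setting the coroutine would be awaited directly instead.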
- _get_publication(publication_id: str) → aoptk.literature.publication.Publication
Parse a single PDF and return a Publication object.
- Parameters:
publication_id (str) – The publication ID to retrieve and parse.
- _get_full_text(publication_id: str) → str | None
Retrieve the full text for a given publication ID.
- Parameters:
publication_id (str) – The publication ID to retrieve the full text for.
- _get_file(publication_id: str, file_format: str) → aoptk.literature.pdf.PDF | str | None
Retrieve the file for a given publication ID and format.
- _get_figures(publication_id: str) → list[str]
Retrieve the figure files for a given publication ID.
- Parameters:
publication_id (str) – The publication ID to retrieve the figure files for.
- _extract_figures_from_supplements(publication_id: str, supplementary_files: list[str]) → list[str]
Extract figure files from the supplementary files.
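The source does not say how figure files are recognised among the supplements. As one possible illustration, assuming selection by file extension, the filtering step could look like the sketch below. The helper name pick_figure_files and the extension set are hypothetical and not taken from aoptk.

```python
from pathlib import PurePosixPath

# Assumed set of image extensions; the real method may apply a different rule.
FIGURE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def pick_figure_files(supplementary_files: list[str]) -> list[str]:
    """Return the entries whose extension looks like an image, preserving order."""
    return [
        name
        for name in supplementary_files
        # Compare case-insensitively so "fig1.JPG" is treated like "fig1.jpg".
        if PurePosixPath(name).suffix.lower() in FIGURE_EXTENSIONS
    ]


files = ["paper.pdf", "fig1.JPG", "table_s1.xlsx", "supp/fig2.png"]
figures = pick_figure_files(files)
```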
- _get_json(publication_id: str) → str | None
Retrieve the JSON for a given publication ID.
- Parameters:
publication_id (str) – The publication ID to retrieve the JSON for.
- _get_pdf(publication_id: str) → aoptk.literature.pdf.PDF | None
Retrieve the PDF for a given publication ID.
- Parameters:
publication_id (str) – The publication ID to retrieve the PDF for.
- _get_publication_count_and_ids(mindate: str | None = None, maxdate: str | None = None) → tuple[int, list[str]]
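The signature suggests a date-bounded search that returns a hit count together with the matching IDs. The source does not describe the implementation, so the following is only a sketch, under the assumption that the lookup goes through the NCBI E-utilities esearch endpoint. build_esearch_url and parse_esearch are hypothetical helper names, and the sample response is made up.

```python
from typing import Optional
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(
    query: str,
    mindate: Optional[str] = None,
    maxdate: Optional[str] = None,
    retmax: int = 100,
) -> str:
    """Build an esearch URL for the PMC database, optionally bounded by date."""
    params = {"db": "pmc", "term": query, "retmode": "json", "retmax": retmax}
    if mindate and maxdate:
        # esearch requires mindate and maxdate to be supplied together.
        params.update({"datetype": "pdat", "mindate": mindate, "maxdate": maxdate})
    return f"{ESEARCH_URL}?{urlencode(params)}"


def parse_esearch(payload: dict) -> tuple[int, list[str]]:
    """Extract (total hit count, IDs on this page) from a decoded esearch JSON body."""
    result = payload["esearchresult"]
    # esearch reports the count as a string, so convert it explicitly.
    return int(result["count"]), list(result["idlist"])


url = build_esearch_url("adverse outcome pathway", "2020/01/01", "2020/12/31")

# Made-up response in the shape esearch returns with retmode=json.
sample = {"esearchresult": {"count": "2", "idlist": ["111", "222"]}}
count, ids = parse_esearch(sample)
```

The count can exceed the number of IDs on a page, since retmax caps the page size. That would be one reason to return both values, for instance to split a large result set into narrower date windows.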