aoptk.text_generation_api

Attributes

topics

Exceptions

LLMFailureError

Base class for capturing LLM failures.

Classes

TextGenerationAPI

Text generation API using OpenAI.

Module Contents

aoptk.text_generation_api.topics[source]
exception aoptk.text_generation_api.LLMFailureError[source]

Bases: Exception

Base class for capturing LLM failures.

class aoptk.text_generation_api.TextGenerationAPI(model: str = 'gpt-oss-120b', url: str = 'https://llm.ai.e-infra.cz/v1', api_key: str = os.environ.get('CERIT_API_KEY'))[source]

Bases: aoptk.find_chemical.FindChemical, aoptk.relationships.find_relationship.FindRelationship, aoptk.normalization.normalize_chemical.NormalizeChemical, aoptk.literature.convert_pdf_scan.ConvertPDFScan, aoptk.literature.convert_image.ConvertImage, aoptk.literature.find_relevant_publication.FindRelevantPublication

Text generation API using OpenAI.

role: str = 'user'[source]
temperature: float = 0[source]
top_p: float = 1[source]
client: None = None[source]
prompts_dir: pathlib.Path[source]
chemical_prompt_template: str = 'chemical_prompt.txt'[source]
relationship_text_prompt_template: str = 'relationship_text_prompt.txt'[source]
relationship_text_images_prompt_template: str = 'relationship_text_images_prompt.txt'[source]
relationships_table_prompt_template: str = 'relationships_table_prompt.txt'[source]
normalization_prompt_template: str = 'normalization_prompt.txt'[source]
convert_pdf_scan_prompt_template: str = 'convert_pdf_scan_prompt.txt'[source]
convert_image_prompt_template: str = 'convert_image_prompt.txt'[source]
find_relevant_publications_prompt_template: str = 'find_relevant_publications_prompt.txt'[source]
specification_relationship_text_prompt: str = ''[source]
model = 'gpt-oss-120b'[source]
url = 'https://llm.ai.e-infra.cz/v1'[source]
api_key[source]
find_relationships_in_text(text: str, chemicals: list[aoptk.chemical.Chemical], effects: list[aoptk.effect.Effect], relationship_type: aoptk.relationship_type.RelationshipType) list[aoptk.relationships.relationship.Relationship][source]

Find relationships between chemicals and effects.

Parameters:
  • text (str) – The input text.

  • chemicals (list[Chemical]) – List of chemical entities.

  • effects (list[Effect]) – List of effect entities.

  • relationship_type (RelationshipType) – The relationship type to classify.

_relationship_prompt(text: str, chemical: aoptk.chemical.Chemical, effect: aoptk.effect.Effect, relationship_type: aoptk.relationship_type.RelationshipType) str[source]

Classify the relationship between a chemical and an effect.

Parameters:
  • text (str) – The input text.

  • chemical (Chemical) – The chemical entity.

  • effect (Effect) – The effect entity.

  • relationship_type (RelationshipType) – The relationship type to classify.

_render_prompt(template_name: str, **context: object) str[source]
_prompt(content: str) str[source]
_select_relationship_type(response: str, relationship_type: aoptk.relationship_type.RelationshipType) str | None[source]

Select the relationship type based on the response.

Parameters:
  • response (str) – The response from the model indicating the relationship type.

  • relationship_type (RelationshipType) – The relationship type to classify.

find_chemicals(text: str) list[aoptk.chemical.Chemical][source]

Find chemicals in the given text.

Parameters:

text (str) – The input text to search for chemicals.

_encode_image(image_path: str) tuple[str, str][source]

Encode the image at the given path to a base64 string and return MIME type.

Parameters:

image_path (str) – The path to the image to encode.

Returns:

A tuple of (base64_encoded_image, mime_type).

Return type:

tuple[str, str]

_process_colon_separated_response(response: str, effect: aoptk.effect.Effect, relationship_type: aoptk.relationship_type.RelationshipType, image_path: str) list[aoptk.relationships.relationship.Relationship][source]

Process the response from the model that is colon seperated.

Parameters:
  • response (str) – The response from the model.

  • effect (Effect) – The effect entity.

  • relationship_type (RelationshipType) – The relationship type to classify.

  • context (str) – The path to the image, used for context in the relationship.

  • image_path (str) – The path to the image, used for context in the relationship.

find_relationships_in_table(table_df: pandas.DataFrame, effects: list[aoptk.effect.Effect], relationship_type: aoptk.relationship_type.RelationshipType) list[aoptk.relationships.relationship.Relationship][source]

Find relationships between chemicals and effects in a table.

Parameters:
  • table_df (pd.DataFrame) – Pandas DataFrame.

  • relationship_type (RelationshipType) – The relationship type to classify.

  • effects (list[Effect]) – List of effect entities.

_classify_relationships_in_table(table_df: pandas.DataFrame, effect: aoptk.effect.Effect, relationship_type: aoptk.relationship_type.RelationshipType) list[aoptk.relationships.relationship.Relationship][source]

Classify relationships between chemicals and an effect in a table.

Parameters:
  • table_df (pd.DataFrame) – Pandas DataFrame.

  • effect (Effect) – The effect entity.

  • relationship_type (RelationshipType) – The relationship type to classify.

Returns:

List of relationships found in the table.

Return type:

list[Relationship]

normalize_chemical(chemical: aoptk.chemical.Chemical, chemical_list: list[aoptk.chemical.Chemical]) aoptk.chemical.Chemical[source]

Normalize the chemical name by finding a matching name in the chemical list.

Parameters:
  • chemical (Chemical) – The chemical to normalize.

  • chemical_list (list[Chemical]) – The list of chemicals to match against.

Returns:

The normalized chemical.

Return type:

Chemical

_find_matching_name(chemical: aoptk.chemical.Chemical, chemical_list: list[aoptk.chemical.Chemical]) aoptk.chemical.Chemical | None[source]

Find a matching chemical name in the chemical list.

Parameters:
  • chemical (Chemical) – The chemical to find a match for.

  • chemical_list (list[Chemical]) – The list of chemicals to match against.

Returns:

The matching chemical name, or None if no match is found.

Return type:

Chemical

convert_pdf_scan(img_base64: str, mime_type: str) str[source]

Extract text from a base64-encoded image.

Parameters:
  • img_base64 (str) – Base64-encoded image data.

  • mime_type (str) – MIME type of the image. Defaults to “image/jpeg”.

Returns:

Extracted text from the image.

Return type:

str

find_relationships_in_text_and_images(text: str, image_paths: list[str], relationship_type: aoptk.relationship_type.RelationshipType, effects: list[aoptk.effect.Effect]) list[aoptk.relationships.relationship.Relationship][source]

Find relationships between chemicals and effects in the given text and images combined.

Parameters:
  • text (str) – The input text.

  • image_paths (list[str]) – List of paths to images.

  • relationship_type (RelationshipType) – The relationship type to classify.

  • effects (list[Effect]) – List of effect entities.

_classify_relationships_in_text_and_images(text: str, image_paths: list[str], effect: aoptk.effect.Effect, relationship_type: aoptk.relationship_type.RelationshipType) list[aoptk.relationships.relationship.Relationship][source]

Classify relationships between chemicals and an effect in the given text and images combined.

Parameters:
  • text (str) – The input text.

  • image_paths (list[str]) – List of paths to images.

  • effect (Effect) – The effect entity.

  • relationship_type (RelationshipType) – The relationship type to classify.

convert_image(image_path: str, text: str) str[source]

Convert an image to text.

Parameters:
  • image_path (str) – Path to the image.

  • text (str) – The full text of the publication for context.

find_relevant_publications(question: str, text: str) bool | None[source]

Answer the question based on a given text.

Parameters:
  • question (str) – The question to search for relevant publications.

  • text (str) – The extracted text of the publication.