Skip to content

Standardised File Format

TextKGCData uses a standardised file format for knowledge graph data with textual descriptions.


File Types

entity_ids.txt

Plain text file with one entity ID per line.

Q42
Q123
Q999

entity_id2_name.json

JSON mapping entity IDs to names.

{
  "Q42": "Douglas Adams",
  "Q123": "Example Entity"
}

entity_id2_description.json

JSON mapping entity IDs to descriptions.

{
  "Q42": "Douglas Adams was an English author, best known for The Hitchhiker's Guide to the Galaxy.",
  "Q123": "This is a sample entity used for demonstration purposes."
}

relation_id2name.json

JSON mapping relation IDs to names.

{
  "P31": "instance of",
  "P279": "subclass of",
  "P50": "author"
}


Directory Structure

<dataset>_standardized_/
    entity_ids.txt
    entity_id2_name.json
    entity_id2_description.json
    relation_id2name.json