How to Import Data in Labelo

Importing data into Labelo is a straightforward process that allows you to start annotating quickly. Here’s a guide to help you understand the types of data you can import and the methods available.

Important

Ensure your data files meet the required formats specified by Labelo to avoid import errors.

Types of Data Supported

| Data Type        | Supported Formats                           |
|------------------|---------------------------------------------|
| Text Files       | .txt                                        |
| Audio Files      | .wav, .mp3, .flac, .m4a, .ogg               |
| Video Files      | .mp4, .mpeg4, .webp, .webm                  |
| Image Files      | .jpg, .jpeg, .png, .gif, .bmp, .svg, .webp  |
| HTML Files       | .html, .htm, .xml                           |
| Time Series Data | .csv, .tsv                                  |
| Common Formats   | .csv, .tsv, .json                           |
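
Before uploading a batch of files, it can help to check their extensions against the table above. The following is a minimal sketch, not part of Labelo itself: the dictionary keys and the function names `matching_data_types` and `unsupported_files` are made up for this illustration, and only the extension lists come from the table.

```python
from pathlib import Path

# Extensions copied from the table above; the group names are illustrative only.
SUPPORTED_FORMATS = {
    "text": {".txt"},
    "audio": {".wav", ".mp3", ".flac", ".m4a", ".ogg"},
    "video": {".mp4", ".mpeg4", ".webp", ".webm"},
    "image": {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".svg", ".webp"},
    "html": {".html", ".htm", ".xml"},
    "time_series": {".csv", ".tsv"},
    "common": {".csv", ".tsv", ".json"},
}


def matching_data_types(filename: str) -> list[str]:
    """Return every data type whose extension list contains the file's suffix."""
    suffix = Path(filename).suffix.lower()
    return [name for name, exts in SUPPORTED_FORMATS.items() if suffix in exts]


def unsupported_files(filenames: list[str]) -> list[str]:
    """Return the files whose extension appears nowhere in the table."""
    return [f for f in filenames if not matching_data_types(f)]
```

Running `unsupported_files` over a directory listing before an import gives a quick list of files to convert or leave out. Note that some extensions, such as .webp and .csv, appear under more than one data type, so an extension alone does not decide how a file will be used in a project.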

Import Methods

Labelo offers several convenient ways to import your data:

  1. Direct Upload:
    • Use the web interface to upload files directly from your computer.
    • You can drag and drop files or click to browse and select them.
  2. Cloud Storage Integration:
    • Connect to cloud services like AWS S3, Google Cloud Storage, or Azure Blob Storage.
    • This method allows you to import large datasets stored in the cloud without needing to download them first.
  3. URLs:
    • Import data directly from publicly accessible URLs.
    • This is useful for online datasets and for files that are already hosted on the web.

Basic JSON Format for Labelo

For a smooth import into Labelo, provide your data as a list of tasks in JSON. The JSON file contains a data key that holds the tasks, and each task is a JSON dictionary. If the data key is missing, Labelo treats the entire JSON file as a single task.

Within each task dictionary, use key-value pairs whose keys match the source keys expected by the object tags in your project’s label configuration. For example, a tag declared with value="$text" reads the task's text key.
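
One way to catch mismatched keys before importing is to read the `$key` references out of the label configuration and compare them with a task's keys. The sketch below assumes the configuration is well-formed XML with a single root element. The `<View>` wrapper, the sample configuration, and the function names `source_keys` and `missing_keys` are hypothetical and exist only for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical label configuration, written only for this illustration.
LABEL_CONFIG = """
<View>
  <Text value="$text"/>
  <Image value="$image"/>
</View>
"""


def source_keys(label_config: str) -> set[str]:
    """Collect every key referenced as value="$key" in the configuration."""
    root = ET.fromstring(label_config)
    keys = set()
    for element in root.iter():
        match = re.fullmatch(r"\$(\w+)", element.get("value", ""))
        if match:
            keys.add(match.group(1))
    return keys


def missing_keys(task_data: dict, label_config: str) -> set[str]:
    """Return the source keys the configuration expects but the task lacks."""
    return source_keys(label_config) - set(task_data)
```

For the configuration above, `missing_keys({"text": "hi"}, LABEL_CONFIG)` reports that the `image` key is absent, which is the kind of mismatch that would otherwise only show up once the task is opened for labeling.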

Labelo interprets field values differently based on the object tag used:

  • <Text value="$key">: Interprets the value as plain text.
  • <HyperText value="$key">: Interprets the value as HTML markup.
  • <HyperText value="$key" encoding="base64">: Interprets the value as base64 encoded HTML markup.
  • <Audio value="$key">: Expects the value to be a valid URL for an audio file, hosted on a server with CORS enabled.
  • <Image value="$key">: Expects the value to be a valid URL for an image file.
  • <TimeSeries value="$key">: Interprets the value as a URL to a CSV/TSV file if valueType is set to "url". If valueType is "json", the value is interpreted as a JSON dictionary with an array for each column: "value": {"first_column": [...], ...}.
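
Two of the value forms above need some preparation: base64 encoded HTML and the column-wise dictionary for a time series. The sketch below shows one way to produce both with the Python standard library. It assumes UTF-8 text and a CSV with a header row and purely numeric cells. The function names and the `html` and `series` keys are illustrative, not names Labelo requires.

```python
import base64
import csv
import io


def html_to_base64(html: str) -> str:
    """Encode HTML markup as base64 text (UTF-8 is an assumption here)."""
    return base64.b64encode(html.encode("utf-8")).decode("ascii")


def csv_to_columns(csv_text: str) -> dict[str, list[float]]:
    """Turn CSV text with a header row into {"column": [values, ...]}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames or []}
    for row in reader:
        for name, cell in row.items():
            # Assumes every cell is numeric; adapt for timestamps or text.
            columns[name].append(float(cell))
    return columns


# Example task fields; "html" and "series" are hypothetical key names.
task_data = {
    "html": html_to_base64("<p>Hello</p>"),
    "series": csv_to_columns("time,value\n0,1.5\n1,2.5\n"),
}
```

The resulting `task_data` dictionary can be serialized with `json.dumps` like any other task. The `html` field would pair with a HyperText tag that sets encoding="base64", and the `series` field with a TimeSeries tag whose valueType is "json".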

The following sample file contains three tasks and one annotation:

```json
{
  "data": [
    {
      "text": "This is a sample text for annotation.",
      "image": "https://example.com/image1.jpg",
      "audio": "https://example.com/audio1.wav"
    },
    {
      "text": "Another sample text for labeling.",
      "image": "https://example.com/image2.jpg",
      "audio": "https://example.com/audio2.wav"
    },
    {
      "text": "More text data to annotate.",
      "image": "https://example.com/image3.jpg",
      "audio": "https://example.com/audio3.wav"
    }
  ],
  "annotations": [
    {
      "id": 1,
      "result": [
        {
          "from_name": "label",
          "to_name": "text",
          "type": "textarea",
          "value": {
            "text": ["Annotated text for the first entry."],
            "start": 0,
            "end": 30
          }
        }
      ]
    }
  ]
}
```

data: An array of tasks where each task can contain different types of data (text, image, audio).

annotations: An optional array that includes annotations for the tasks, detailing how the data has been labeled.

You can adjust the keys and values based on the specific requirements of your labeling project.
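
When the tasks come from an existing list of texts and file URLs, the import file can be generated instead of written by hand. The sketch below builds the `data` array described above and serializes it. The function name `build_import_payload` and the choice of `text` and `image` keys are assumptions for this example, so rename the keys to match your own label configuration.

```python
import json


def build_import_payload(texts: list[str], image_urls: list[str]) -> dict:
    """Pair each text with an image URL and wrap the tasks under a "data" key."""
    if len(texts) != len(image_urls):
        raise ValueError("texts and image_urls must have the same length")
    tasks = [{"text": t, "image": u} for t, u in zip(texts, image_urls)]
    return {"data": tasks}


payload = build_import_payload(
    ["This is a sample text for annotation.", "Another sample text for labeling."],
    ["https://example.com/image1.jpg", "https://example.com/image2.jpg"],
)

# Serialize the payload; save this string to a .json file and upload it.
serialized = json.dumps(payload, indent=2, ensure_ascii=False)
```

The optional `annotations` array is left out here because it is only needed when the data has already been labeled.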

Import Workflow

You can import datasets in Labelo in two main ways:

  1. From the Project Page: When creating a new project, you can import data right away.

  2. From the Task Page: Within an existing project, click the import button to add new data.

Tips for a Smooth Import Process

  • Check File Formats: Make sure your files are in supported formats (e.g., JPEG for images, TXT for text).
  • Organize Your Data: Keeping your files well-organized on your computer can save you time during the import process.
  • Use Cloud Services: For large datasets, consider using cloud storage for easier access and faster imports.

By following these steps, you can easily import your data into Labelo and get started with your annotation projects!