  1. Load the Ontology:

    • The function reads the ontology file to extract the list of intents and slot types.

  2. Fit the Intent Label Encoder:

    • The intent encoder assigns a unique numerical label to each intent in the ontology:

      intent_label_encoder.fit(intents)

    • Key Insight: Fitting the encoder on the full ontology fixes the mapping from intent names to numerical labels, so intent classification always produces outputs in the same, consistent format.

  3. Generate BIO Tags for Slots:

    • Slot tags are converted into BIO format:

      • B-{slot}: Beginning of a slot entity.

      • I-{slot}: Inside a slot entity.

      • O: Outside of any slot entity.

    • All slot tags are compiled into a single list:

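      # Parentheses let the expression span several lines;
      # a bare leading '+' on a new line is not valid Python.
      all_slot_tags = (
          ['O']
          + [f'B-{slot}' for slot in slots.keys()]
          + [f'I-{slot}' for slot in slots.keys()]
      )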

    • These tags are then fitted to the slot encoder.

    • Why BIO Format? This labeling scheme marks where each slot entity begins and where it continues, which makes it possible to recover the boundaries of multi-token slot entities.

      • Think about why this could be important in our context and what slots could specifically benefit.
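The three steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the ontology layout, the intent and slot names, and the helpers `SimpleLabelEncoder`, `build_encoders`, and `bio_tags` are hypothetical and not taken from the project code. `SimpleLabelEncoder` is a small stand-in for whatever encoder the project uses (a scikit-learn style `LabelEncoder` would presumably behave similarly).

```python
import json

# Hypothetical ontology; the real file's structure is not shown in this guide.
ONTOLOGY_JSON = """
{
  "intents": ["book_flight", "cancel_booking", "check_status"],
  "slots": {"city": {}, "date": {}}
}
"""


class SimpleLabelEncoder:
    """Minimal stand-in for a label encoder: maps each label to an integer."""

    def fit(self, labels):
        # Sort the unique labels so the mapping is deterministic across runs.
        self.classes_ = sorted(set(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        return [self._index[label] for label in labels]


def build_encoders(ontology):
    """Fit one encoder on the intents and one on the BIO slot tags."""
    intents = ontology["intents"]
    slots = ontology["slots"]

    intent_label_encoder = SimpleLabelEncoder().fit(intents)

    all_slot_tags = (
        ["O"]
        + [f"B-{slot}" for slot in slots]
        + [f"I-{slot}" for slot in slots]
    )
    slot_label_encoder = SimpleLabelEncoder().fit(all_slot_tags)
    return intent_label_encoder, slot_label_encoder


def bio_tags(tokens, spans):
    """Tag tokens in BIO format.

    spans is a list of (start, end_exclusive, slot_name) token index ranges.
    """
    tags = ["O"] * len(tokens)
    for start, end, slot in spans:
        tags[start] = f"B-{slot}"
        for i in range(start + 1, end):
            tags[i] = f"I-{slot}"
    return tags


ontology = json.loads(ONTOLOGY_JSON)
intent_encoder, slot_encoder = build_encoders(ontology)

# "New York" spans two tokens, so it needs a B- tag followed by an I- tag.
tokens = ["fly", "to", "New", "York", "tomorrow"]
tags = bio_tags(tokens, [(2, 4, "city"), (4, 5, "date")])
slot_ids = slot_encoder.transform(tags)
```

In this sketch, `tags` pairs each token with exactly one slot tag, and `slot_ids` is the integer sequence a slot-filling model would be trained to predict. A multi-token value such as "New York" is where the B-/I- distinction matters.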

Done? Proceed with Our Dataset.