Main Interface Functions

Create Default Parameter

xlm_common_params_t xlm_create_default_param();

Create the general model parameters with default values.

  • Return Value

    • An xlm_common_params_t structure populated with default parameter values.

Initialize the Instance

int xlm_init(xlm_common_params_t *param, xlm_callback_t callback, void **llm_handle);

Initialize the instance.

  • Parameters

    • [in]: param, the general model parameters used to initialize the instance.

    • [in]: callback, the callback function pointer for registering a task, i.e., the execution entity of the task.

    • [out]: llm_handle, the inference handle, which is used to manage subsequent tasks.

  • Return Value

    • 0 (Initialization successful), -1 (Initialization failed).
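
The initialization flow described above can be sketched as follows. The SDK header is not reproduced in this section, so the block declares stand-in types and stub bodies purely so that it compiles on its own. The `placeholder` field, the callback signature, and the stub behavior are assumptions for illustration; only the two function prototypes come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Stand-ins for the SDK header (hypothetical; use the real include instead) ---- */
typedef struct { int placeholder; } xlm_common_params_t;           /* real fields are not shown here */
typedef void (*xlm_callback_t)(const char *text, void *userdata);  /* assumed callback signature */

static xlm_common_params_t xlm_create_default_param(void) {
    xlm_common_params_t p = { 0 };
    return p;
}

static int xlm_init(xlm_common_params_t *param, xlm_callback_t callback, void **llm_handle) {
    static int instance;  /* stub: dummy object standing in for the engine */
    if (!param || !callback || !llm_handle) return -1;
    *llm_handle = &instance;
    return 0;
}
/* ---- End of stand-ins ---- */

/* Callback registered at init time; this sketch ignores its arguments. */
static void on_result(const char *text, void *userdata) {
    (void)text;
    (void)userdata;
}

/* Creates default parameters, initializes an instance, and returns the
 * xlm_init status: 0 on success, -1 on failure. On failure *out_handle is NULL. */
int start_instance(void **out_handle) {
    assert(out_handle != NULL);
    *out_handle = NULL;

    xlm_common_params_t param = xlm_create_default_param();
    /* Adjust fields of `param` here before passing it to xlm_init. */

    if (xlm_init(&param, on_result, out_handle) != 0) {
        *out_handle = NULL;
        return -1;
    }
    return 0;
}
```

A caller would invoke `start_instance(&handle)` once at startup and proceed to inference only when it returns 0.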

Synchronous Inference

int xlm_infer(xlm_handle_t handle, xlm_input_t *input, void *userdata);

Run inference synchronously. Each inference includes a complete prefill and decode process.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: input, the model inference input, including data such as prompt, image, and task priority.

    • [in]: userdata, user-defined data, which is returned through the callback function along with the inference result.

  • Return Value

    • 0 (Inference task executed successfully), -1 (Failed to obtain the inference handle, task returned).
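
To illustrate how `userdata` travels from the call site back to the callback, here is a minimal sketch. The SDK header is not shown in this section, so the handle layout, the `prompt` field of `xlm_input_t`, the callback signature, and the stub body of `xlm_infer` (which simply echoes the prompt to the callback) are all assumptions. Only the `xlm_infer` prototype is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* ---- Stand-ins for the SDK header (hypothetical) ---- */
typedef void (*xlm_callback_t)(const char *text, void *userdata);  /* assumed signature */
typedef struct { xlm_callback_t callback; } xlm_instance_stub_t;   /* assumed handle contents */
typedef xlm_instance_stub_t *xlm_handle_t;
typedef struct { const char *prompt; } xlm_input_t;                /* assumed field name */

/* Stub: a real implementation would run prefill and decode. This one just
 * echoes the prompt to the registered callback together with userdata. */
static int xlm_infer(xlm_handle_t handle, xlm_input_t *input, void *userdata) {
    if (!handle || !handle->callback || !input) return -1;
    handle->callback(input->prompt, userdata);
    return 0;
}
/* ---- End of stand-ins ---- */

/* Per-request context owned by the caller and handed back in the callback. */
typedef struct {
    int  request_id;
    char last_text[64];
} request_ctx_t;

/* Copies the delivered text into the request context (truncating if needed). */
void on_result(const char *text, void *userdata) {
    request_ctx_t *ctx = (request_ctx_t *)userdata;
    if (!ctx || !text) return;
    strncpy(ctx->last_text, text, sizeof ctx->last_text - 1);
    ctx->last_text[sizeof ctx->last_text - 1] = '\0';
}

/* Runs one blocking request and returns the xlm_infer status (0 or -1). */
int run_request(xlm_handle_t handle, const char *prompt, request_ctx_t *ctx) {
    assert(ctx != NULL);
    xlm_input_t input = { prompt };
    return xlm_infer(handle, &input, ctx);
}
```

Passing a per-request struct as `userdata` lets the callback attribute each result to its request without global state.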

PPL Calculation

int xlm_ppl(xlm_handle_t handle, xlm_input_t *input, void *userdata);

Compute perplexity (PPL). This interface is used only for PPL calculation and is not executed for regular tasks.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: input, the model inference input, typically text or WikiText data.

    • [in]: userdata, user-defined data, which is returned through the callback function along with the inference result.

  • Return Value

    • 0 (PPL calculation task executed successfully), -1 (Failed to obtain the inference handle, task returned).

Asynchronous Inference

int xlm_infer_async(xlm_handle_t handle, xlm_input_t *input, void *userdata);

Run inference asynchronously. Each inference includes a complete prefill and decode process.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: input, the model inference input, including data such as prompt, image, and task priority.

    • [in]: userdata, user-defined data, which is returned through the callback function along with the inference result.

  • Return Value

    • 0 (Inference task executed successfully), -1 (Failed to obtain the inference handle, task returned).

Destroy the Instance

int xlm_destroy(xlm_handle_t *handle);

Release the inference instance resources.

  • Parameters

    • [in]: handle, a pointer to the inference handle obtained through the xlm_init interface.

  • Return Value

    • 0 (Instance destroyed successfully), -1 (Failed to obtain the inference handle, interface returned).
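
Unlike the inference calls, `xlm_destroy` takes the address of the handle. A minimal teardown sketch follows. The handle typedef and the stub body are stand-ins because the SDK header is not shown here. Whether the real function clears the caller's handle is not stated above, so this hypothetical wrapper clears it itself.

```c
#include <assert.h>
#include <stddef.h>

/* ---- Stand-ins for the SDK header (hypothetical) ---- */
typedef void *xlm_handle_t;  /* assumed: opaque pointer type */

/* Stub: reports -1 for a missing handle, 0 otherwise. It deliberately leaves
 * *handle untouched, since the real behavior is not documented here. */
static int xlm_destroy(xlm_handle_t *handle) {
    if (!handle || !*handle) return -1;
    return 0;
}
/* ---- End of stand-ins ---- */

/* Releases the instance and, on success, clears the caller's copy of the
 * handle so it cannot be reused by accident. Returns the xlm_destroy status. */
int shutdown_instance(xlm_handle_t *handle) {
    assert(handle != NULL);
    int rc = xlm_destroy(handle);
    if (rc == 0) {
        *handle = NULL;
    }
    return rc;
}
```

With this wrapper, a second teardown attempt on the same variable reports -1 from the stub instead of touching a released instance.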

Omni Audio Input

int xlm_omni_feed_audio_online(xlm_handle_t handle, omni_online_audio_t audio_input);

Provide audio input when Omni is running online.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: audio_input, the audio input, including the memory start address and length information.

  • Return Value

    • 0 (Audio data transmitted successfully), -1 (Failed to obtain audio data, task returned).

Omni Video Input

int xlm_omni_feed_video_online(xlm_handle_t handle, omni_online_video_t video_input);

Provide video input when Omni is running online.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: video_input, the video input, including the start addresses of the Y and UV components and their width and height.

  • Return Value

    • 0 (Video data transmitted successfully), -1 (Failed to obtain video data, task returned).
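
As an illustration of filling the video input, the sketch below assumes the frame is a single contiguous NV12 buffer, in which the interleaved UV plane starts `width * height` bytes after the start of the Y plane. The text above does not state the pixel format or the field names of `omni_online_video_t`, so the struct layout, the field names, and the stub body of `xlm_omni_feed_video_online` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- Stand-ins for the SDK header (hypothetical) ---- */
typedef void *xlm_handle_t;  /* assumed: opaque pointer type */
typedef struct {
    uint8_t *y_addr;   /* start of the Y (luma) plane, assumed field name */
    uint8_t *uv_addr;  /* start of the UV (chroma) plane, assumed field name */
    int      width;
    int      height;
} omni_online_video_t;

/* Stub: only validates its arguments. */
static int xlm_omni_feed_video_online(xlm_handle_t handle, omni_online_video_t video_input) {
    if (!handle || !video_input.y_addr || !video_input.uv_addr) return -1;
    return 0;
}
/* ---- End of stand-ins ---- */

/* Describes a contiguous NV12 frame: Y plane first, interleaved UV plane after it. */
omni_online_video_t make_nv12_input(uint8_t *frame, int width, int height) {
    assert(frame != NULL && width > 0 && height > 0);
    omni_online_video_t v;
    v.y_addr  = frame;
    v.uv_addr = frame + (size_t)width * (size_t)height;  /* Y plane is width*height bytes */
    v.width   = width;
    v.height  = height;
    return v;
}

/* Builds the input descriptor for one frame and feeds it; returns 0 or -1. */
int feed_frame(xlm_handle_t handle, uint8_t *frame, int width, int height) {
    return xlm_omni_feed_video_online(handle, make_nv12_input(frame, width, height));
}
```

If the camera pipeline delivers the Y and UV planes in separate buffers, the two addresses would be set directly instead of being derived from one base pointer.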

Omni Text Input

int xlm_omni_feed_text_online(xlm_handle_t handle, omni_online_text_t text_input);

Provide text input when Omni is running online.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: text_input, the text input, including system text and user text.

  • Return Value

    • 0 (Text data transmitted successfully), -1 (Failed to obtain text data, task returned).

Omni Synchronous Inference

int xlm_omni(xlm_handle_t handle, xlm_input_t *input, void *userdata);

Start Omni's full processing pipeline synchronously.

  • Parameters

    • [in]: handle, the inference handle obtained through the xlm_init interface.

    • [in]: input, the model inference input.

    • [in]: userdata, user-defined data, which is returned through the callback function along with the inference result.

  • Return Value

    • 0 (Inference task executed successfully), -1 (Failed to obtain the inference handle, task returned).