Providers are the hosts of models or services that AGiXT uses to interact with users.

Logical Language Model (LLM) Providers

We're fully aware that they're called large language models and not logical language models, but our approach to prompting and chaining uses your agents primary LLM for logical reasoning in the background. This is why we refer to them as logical language models in the context of AGiXT.

The agent's LLM provider will be the logical language model. This is generally an intelligent model that can understand and generate human-like text. The LLM provider will be used for generating responses to user input and for understanding the user's intent.

Current LLM Providers

  • OpenAI and Azure OpenAI (GPT-4, etc)
  • ezLocalai (Open source models)
  • Anthropic (Claude)
  • Google (Gemini)
  • Hugging Face (Open source models)
  • default: gpt4free (Free providers, results may vary.)

Vision Providers

Vision providers are used to get descriptions of images that are uploaded in messages, then giving that description to the LLM provider to generate a response as well as execute any commands or chains that are defined in the agent's settings.

Current Vision Providers

  • OpenAI (gpt-4-vision, etc)
  • ezLocalai (Open source models such as deepseek-vl-1.3b-chat)
  • Anthropic (Claude)
  • Google (Gemini)

Image Generation Providers

Image generation providers are used to generate images based on the users input. If the create_image flag is set to true in the message on the /v1/chat/completions endpoint, an image will be generated and sent with the agent's response.

Current Image Generation Providers

  • OpenAI (DALL-E)
  • default: Hugging Face (Stable Diffusion)

Text to Speech (TTS) Providers

Text to speech providers are used to generate speech from the agent's response. This can be used to have the agent speak to the user through the chat interface.

Current TTS Providers

  • OpenAI
  • ezLocalai (Voice cloning TTS)
  • Elevenlabs (Voice cloning TTS over API, listed under agixt Provider)
  • default: Google (gTTS)

Speech Transcription Providers

Speech transcription providers are used to transcribe speech from the user to text. This can be used to have the agent understand the user's speech input.

Current Speech Transcription Providers

  • OpenAI (Whisper)
  • ezLocalai (Open source models)
  • default: faster-whisper

Speech Translation Providers

Speech translation providers are used to translate speech from any language that the provider understands to English in text. This can be used to have the agent understand the user's speech input in a different language.

Current Speech Translation Providers

  • OpenAI (Whisper)
  • ezLocalai (Open source models)
  • default: faster-whisper

Services for Providers

There are multiple provider services for different providers for features like TTS, Audio to Text, Embeddings, and Image Generation. Each provider now has a services property which is a list of services available from that provider. Providers with an embeddings service will have an additional property for chunk_size for the embedder.

For example, the OpenAI provider has:

self.chunk_size = 1024

def services():
    return [
      "llm", # Language model
      "tts", # Text to speech
      "image", # Image generation
      "embeddings", # Embeddings creation
      "transcription", # Audio transcription to text
      "translation", # Audio translation to text in English

results matching ""

    No results matching ""