FydeOS AI Assistant

Last Update: 2024-05-30

What is FydeOS AI?

FydeOS AI is a system-level AI assistant designed specifically for FydeOS. It integrates with mainstream Artificial Intelligence Generated Content (AIGC) services and can also run local models.

Getting Started with FydeOS AI

System Requirements

Currently, FydeOS AI is in the experimental stage and is only available on openFyde. The supported devices are:

  • openFyde - amd64
  • rpi5-openfyde
  • fydetab_duo-openfyde
  • rock5b-openfyde
  • orangepi5-openfyde
  • edge2-openfyde

Setup Steps

FydeOS AI Third-Party API

  1. Click the launcher at the bottom-left corner, find FydeOS AI among the applications, and click to launch it.
  2. On the FydeOS AI settings page, under Basic Settings, select a service provider and fill in the required information. Once completed, you can converse with the cloud-based large language model.

FydeOS AI Local Model

  1. Go to Hugging Face, choose the model you want to use, and download its XXX.rkllm file.
  2. Launch FydeOS AI, go to Settings - Basic Settings, and select the local model.
  3. Enter the model path. If the model file is in the Downloads folder, the path is Downloads/xxx.rkllm; if the file is saved elsewhere, enter its path relative to "My Files".
  4. Click to start the local model and wait a moment; it will then be ready for conversations.

How to Use?

FydeOS AI is very user-friendly for regular users:

  • You can freely ask FydeOS AI for help in the chat box and view past conversations through the history on the left side.
  • You can use voice input to quickly seek assistance.
  • When text is selected, press the shortcut Ctrl+C+C (hold Ctrl, then press C twice within two seconds) to copy the selection and ask FydeOS AI about it in a pop-up window.

Developer Options

Configuration File

| File Path | Description |
| --- | --- |
| config/config.yaml | Sets default parameters for loading the model |
| config/fix_freq_rk3588.sh | Increases CPU and NPU frequency when running the model |
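To make the table above concrete, here is a hypothetical sketch of what config/config.yaml could look like. The field names mirror the parameter table below, but the model path and values are placeholders; the file actually shipped with FydeOS AI may differ.

```yaml
# Hypothetical sketch of config/config.yaml -- not the shipped file.
modelPath: "Downloads/model.rkllm"   # placeholder path, relative to "My Files"
target_platform: "rk3588"            # or "rk3576"
num_npu_core: 3                      # [1, 3] on rk3588
max_context_len: 512
max_new_tokens: 256
top_k: 40
top_p: 0.9
temperature: 0.8
repeat_penalty: 1.1
frequency_penalty: 0
mirostat: 0                          # 0 = disabled
```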

Parameter Descriptions

| Parameter Name | Type | Description | Default |
| --- | --- | --- | --- |
| modelPath | const char* | Path to the model file | - |
| target_platform | const char* | The hardware platform for running the model: "rk3576" or "rk3588" | - |
| num_npu_core | int32_t | Number of NPU cores to use during inference. For "rk3576" the range is [1, 2]; for "rk3588" the range is [1, 3] | - |
| max_context_len | int32_t | Sets the context size for the prompt | - |
| max_new_tokens | int32_t | Sets the upper limit on the number of tokens generated during model inference | - |
| top_k | int32_t | Top-k sampling selects the next token from the k most likely tokens predicted by the model, reducing the risk of generating low-probability or meaningless tokens. Higher values (e.g. 100) consider more token options, leading to more diverse text; lower values (e.g. 10) focus on the most likely tokens, producing more conservative text. | 40 |
| top_p | float | Top-p sampling, also known as nucleus sampling, selects the next token from the smallest set of tokens whose cumulative probability is at least p, balancing diversity and quality. Higher values (e.g. 0.95) result in more diverse text; lower values (e.g. 0.5) produce more focused and conservative text. | 0.9 |
| temperature | float | Controls the randomness of text generation by adjusting the probability distribution of model output tokens. Higher temperatures (e.g. 1.5) make the output more random and creative; lower temperatures (e.g. 0.5) make it more focused and conservative. At a temperature of 0, the model always chooses the most likely next token, producing identical output each time. | 0.8 |
| repeat_penalty | float | Controls the repetition of token sequences in the generated text, helping to prevent repetitive or monotonous output. Higher values (e.g. 1.5) penalize repetition more strongly; lower values (e.g. 0.9) are more lenient. | 1.1 |
| frequency_penalty | float | Penalizes frequently occurring words/phrases and increases the probability of less common ones. This can make the generated text more diverse but may also lead to incoherent or unexpected results. Range is [-2.0, 2.0]. | 0 |
| mirostat | int32_t | An algorithm that keeps the quality of the generated text within a desired range, balancing coherence and diversity to avoid low-quality output from excessive repetition (the boredom trap) or incoherence (the confusion trap). Values: {0, 1, 2}, where 0 disables the algorithm, 1 enables Mirostat, and 2 enables Mirostat 2.0. | - |
| mirostat_tau | float | Sets the target entropy for Mirostat, representing the desired perplexity of the generated text. Lower values yield more focused and coherent text; higher values yield more diverse but potentially less coherent text. | 5.0 |
| mirostat_eta | float | Sets the learning rate for Mirostat. Lower learning rates adjust more slowly; higher learning rates make the algorithm more responsive. | 0.1 |
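To clarify how top_k, top_p, and temperature interact, the following is a minimal Python sketch of the sampling scheme the table describes. It is illustrative only; the actual RKLLM runtime implements sampling in native code, and the function name here is invented for the example.

```python
import math
import random

def sample_next_token(logits, top_k=40, top_p=0.9, temperature=0.8):
    """Pick the next token index from raw logits using the top-k / top-p /
    temperature scheme described above (illustrative sketch only)."""
    # temperature = 0 degenerates to greedy decoding: always take the argmax.
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature: rescale logits. Values < 1 sharpen the distribution,
    # values > 1 flatten it (more random output).
    scaled = [l / temperature for l in logits]
    # Softmax, shifted by the max for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]
    # Top-p (nucleus): within those, keep the smallest prefix whose
    # cumulative probability reaches p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the surviving tokens and draw one at random.
    norm = sum(probs[i] for i in kept)
    r = random.random() * norm
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With temperature set to 0 the function is deterministic, which matches the table: sample_next_token([1.0, 5.0, 2.0], temperature=0) returns 1, the index of the largest logit.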

Cited from Rockchip_RKLLM_SDK_CN.pdf