FydeOS AI Assistant

Last Update: 2024-05-30

What is FydeOS AI?

FydeOS AI is a system-level AI assistant designed specifically for FydeOS. It integrates with mainstream Artificial Intelligence Generated Content (AIGC) services and can also run local models.

Getting Started with FydeOS AI

System Requirements

Currently, FydeOS AI is in the experimental stage and is only available on openFyde. The supported devices are:

  • openFyde - amd64
  • rpi5-openfyde
  • fydetab_duo-openfyde
  • rock5b-openfyde
  • orangepi5-openfyde
  • edge2-openfyde

Setup Steps

FydeOS AI Third-Party API

  1. Click the launcher at the bottom-left corner, find FydeOS AI among the applications, and click to launch it.
  2. On the FydeOS AI settings page, under Basic Settings, select a service provider and fill in the required information. Once completed, you can converse with the cloud-based large language model.

FydeOS AI Local Model

  1. Go to Hugging Face, choose the model you want to use, and download its XXX.rkllm file.
  2. Launch FydeOS AI, go to Settings - Basic Settings, and select the local model.
  3. Enter the model path. If the model file is in the Downloads folder, the path is Downloads/xxx.rkllm; if the file is saved elsewhere, enter its path relative to "My Files".
  4. Click to start the local model and wait a moment; it will then be ready for conversations.

How to Use?

FydeOS AI is very user-friendly for regular users:

  • You can freely ask FydeOS AI for help in the chat box and view past conversations through the history on the left side.
  • You can use voice input to quickly seek assistance.
  • When text is selected, press the shortcut Ctrl+C+C (hold Ctrl, then press C twice within two seconds) to copy the selection and ask FydeOS AI about it in a pop-up window.

Developer Options

Configuration File

| File Path | Description |
| --- | --- |
| config/config.yaml | Sets default parameters for loading the model |
| config/fix_freq_rk3588.sh | Increases CPU and NPU frequency when running the model |
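To make the table above concrete, here is a hypothetical sketch of what config/config.yaml could look like. The field names mirror the parameter table below, but the model path and values are placeholders; the file actually shipped with FydeOS AI may differ.

```yaml
# Hypothetical sketch of config/config.yaml -- not the shipped file.
modelPath: "Downloads/model.rkllm"   # placeholder path, relative to "My Files"
target_platform: "rk3588"            # or "rk3576"
num_npu_core: 3                      # [1, 3] on rk3588
max_context_len: 512
max_new_tokens: 256
top_k: 40
top_p: 0.9
temperature: 0.8
repeat_penalty: 1.1
frequency_penalty: 0
mirostat: 0                          # 0 = disabled
```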

Parameter Descriptions

| Parameter Name | Type | Description | Default |
| --- | --- | --- | --- |
| modelPath | const char* | Path to the model file | - |
| target_platform | const char* | The hardware platform for running the model: "rk3576" or "rk3588" | - |
| num_npu_core | int32_t | Number of NPU cores to use during inference. For "rk3576" the range is [1, 2]; for "rk3588" the range is [1, 3] | - |
| max_context_len | int32_t | Sets the context size for the prompt | - |
| max_new_tokens | int32_t | Sets the upper limit on the number of tokens generated during model inference | - |
| top_k | int32_t | Top-k sampling selects the next token from the k most likely tokens predicted by the model, reducing the risk of generating low-probability or meaningless tokens. Higher values (e.g. 100) consider more token options, leading to more diverse text; lower values (e.g. 10) focus on the most likely tokens, producing more conservative text. | 40 |
| top_p | float | Top-p sampling, also known as nucleus sampling, selects the next token from the smallest set of tokens whose cumulative probability is at least p, balancing diversity and quality. Higher values (e.g. 0.95) result in more diverse text; lower values (e.g. 0.5) produce more focused and conservative text. | 0.9 |
| temperature | float | Controls the randomness of text generation by adjusting the probability distribution of model output tokens. Higher temperatures (e.g. 1.5) make the output more random and creative; lower temperatures (e.g. 0.5) make it more focused and conservative. At a temperature of 0, the model always chooses the most likely next token, producing identical output each time. | 0.8 |
| repeat_penalty | float | Controls the repetition of token sequences in the generated text, helping to prevent repetitive or monotonous output. Higher values (e.g. 1.5) penalize repetition more strongly; lower values (e.g. 0.9) are more lenient. | 1.1 |
| frequency_penalty | float | Penalizes frequently occurring words/phrases and increases the probability of less common ones. This can make the generated text more diverse but may also lead to incoherent or unexpected results. Range is [-2.0, 2.0]. | 0 |
| mirostat | int32_t | An algorithm that keeps the quality of the generated text within a desired range, balancing coherence and diversity to avoid low-quality output from excessive repetition (the boredom trap) or incoherence (the confusion trap). Values: {0, 1, 2}, where 0 disables the algorithm, 1 enables Mirostat, and 2 enables Mirostat 2.0. | - |
| mirostat_tau | float | Sets the target entropy for Mirostat, representing the desired perplexity of the generated text. Lower values yield more focused and coherent text; higher values yield more diverse but potentially less coherent text. | 5.0 |
| mirostat_eta | float | Sets the learning rate for Mirostat. Lower learning rates adjust more slowly; higher learning rates make the algorithm more responsive. | 0.1 |
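To clarify how top_k, top_p, and temperature interact, the following is a minimal Python sketch of the sampling scheme the table describes. It is illustrative only; the actual RKLLM runtime implements sampling in native code, and the function name here is invented for the example.

```python
import math
import random

def sample_next_token(logits, top_k=40, top_p=0.9, temperature=0.8):
    """Pick the next token index from raw logits using the top-k / top-p /
    temperature scheme described above (illustrative sketch only)."""
    # temperature = 0 degenerates to greedy decoding: always take the argmax.
    if temperature <= 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature: rescale logits. Values < 1 sharpen the distribution,
    # values > 1 flatten it (more random output).
    scaled = [l / temperature for l in logits]
    # Softmax, shifted by the max for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-k: keep only the k most likely tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]
    # Top-p (nucleus): within those, keep the smallest prefix whose
    # cumulative probability reaches p.
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize over the surviving tokens and draw one at random.
    norm = sum(probs[i] for i in kept)
    r = random.random() * norm
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With temperature set to 0 the function is deterministic, which matches the table: sample_next_token([1.0, 5.0, 2.0], temperature=0) returns 1, the index of the largest logit.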

Cited from Rockchip_RKLLM_SDK_CN.pdf