FunctionGemma 270M

FunctionGemma 270M is a model derived from Gemma 3 and fine-tuned for structured tool and function calling. Instead of producing free-form text, it emits a strict grammar of the form <start_function_call>call:NAME{...}<end_function_call>, which the SDK parses into typed FunctionCall values. At roughly 283 MB on disk, it is small enough to load alongside an ASR + TTS pipeline on phone-class hardware, where it acts as a "router" that turns user utterances into tool calls.

On-device function calling

FunctionGemma slots into a voice agent at the point where you would otherwise call a hosted LLM to route tools. Because the output grammar is strict by design, you get parsed FunctionCall objects directly, with no JSON repair and no schema-mode prompting.

Platforms

Platform	Format	Size	HuggingFace
Apple (macOS / iOS)	CoreML	~283 MB	aufklarer/FunctionGemma-270M-CoreML
Android (and Linux / Windows via Speech Core)	LiteRT-LM	~283 MB	soniqo/FunctionGemma-270M-LiteRT-LM

Grammar

The model is trained to emit a single call (or a sequence of calls), each wrapped in two sentinel tokens:

<start_function_call>call:set_timer{"minutes": 5, "label": "tea"}<end_function_call>
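To make the grammar concrete, here is a minimal sketch of how output in this form could be split into a function name and a raw JSON argument string. It is not the SDK's parser: the ParsedCall type and the parseCalls function are hypothetical names used only for illustration, and the sketch assumes the arguments always start at the first "{" after the name.

```swift
import Foundation

// Hypothetical value type for illustration; the SDK's FunctionCall may differ.
struct ParsedCall {
    let name: String
    let argumentsJSON: String
}

/// Extracts every <start_function_call>call:NAME{...}<end_function_call> span
/// from the model output, in order of appearance.
func parseCalls(_ output: String) -> [ParsedCall] {
    let open = "<start_function_call>call:"
    let close = "<end_function_call>"
    var calls: [ParsedCall] = []
    var rest = Substring(output)

    // Find the next opening sentinel, then the first closing sentinel after it.
    while let start = rest.range(of: open),
          let end = rest.range(of: close, range: start.upperBound..<rest.endIndex) {
        let body = rest[start.upperBound..<end.lowerBound]
        // Everything before the first "{" is the name; the rest is the JSON object.
        if let brace = body.firstIndex(of: "{") {
            calls.append(ParsedCall(
                name: String(body[..<brace]),
                argumentsJSON: String(body[brace...])
            ))
        }
        rest = rest[end.upperBound...]
    }
    return calls
}

let output = #"<start_function_call>call:set_timer{"minutes": 5, "label": "tea"}<end_function_call>"#
let parsed = parseCalls(output)
print(parsed[0].name)           // set_timer
print(parsed[0].argumentsJSON)  // {"minutes": 5, "label": "tea"}
```

Because the loop keeps scanning after each closing sentinel, the same function also handles a sequence of calls emitted back to back.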

The SDK parses each call into a typed FunctionCall(name:, arguments:) value. Arguments are decoded as JSON, which you can map directly onto a Swift Codable type or a Kotlin @Serializable data class.
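As an illustration of the Codable mapping, the sketch below decodes the set_timer arguments into a Swift struct using Foundation's JSONDecoder. The SetTimerArgs type and the decodeSetTimerArgs helper are hypothetical, and the sketch assumes the arguments are available as raw JSON text; how the SDK actually exposes them may differ.

```swift
import Foundation

// Hypothetical argument type for the set_timer tool used on this page.
struct SetTimerArgs: Codable, Equatable {
    let minutes: Int
    let label: String
}

/// Decodes raw JSON argument text into SetTimerArgs.
/// Throws a DecodingError if a field is missing or has the wrong type.
func decodeSetTimerArgs(_ json: String) throws -> SetTimerArgs {
    try JSONDecoder().decode(SetTimerArgs.self, from: Data(json.utf8))
}

// Assumption: the call's arguments arrive as JSON text like this.
let argumentsJSON = #"{"minutes": 5, "label": "tea"}"#

do {
    let args = try decodeSetTimerArgs(argumentsJSON)
    print(args.minutes, args.label)  // 5 tea
} catch {
    // A small model can omit a field or emit the wrong type, so handle failure.
    print("Could not decode set_timer arguments: \(error)")
}
```

Decoding into a concrete type gives a single place to reject malformed arguments before any tool runs.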

Swift (Apple, CoreML)

On Apple platforms, FunctionGemma is exposed through speech-swift as the FunctionGemma class. It downloads the CoreML model from HuggingFace on first use and runs on the Neural Engine.

import FunctionGemma

let model = try await FunctionGemma.fromPretrained()

let tools = """
- set_timer(minutes: Int, label: String)
- get_weather(city: String)
"""

let calls = try model.callFunctions(
    tools: tools,
    userMessage: "Set a 5 minute tea timer"
)

for call in calls {
    print(call.name)       // "set_timer"
    print(call.arguments)  // ["minutes": 5, "label": "tea"]
}
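Once calls are parsed, the "router" role comes down to looking up a handler by name. The following sketch shows one possible dispatch table. The ToolHandler alias, the handlers dictionary, and the dispatch function are hypothetical, and it assumes arguments can be read as a [String: Any] dictionary, which may not match the SDK's actual type.

```swift
import Foundation

// Hypothetical handler type: takes decoded arguments, returns a status string.
typealias ToolHandler = ([String: Any]) -> String

// One entry per tool advertised to the model in the tools list.
let handlers: [String: ToolHandler] = [
    "set_timer": { args in
        let minutes = args["minutes"] as? Int ?? 0
        let label = args["label"] as? String ?? "timer"
        return "Timer '\(label)' set for \(minutes) min"
    },
    "get_weather": { args in
        let city = args["city"] as? String ?? "unknown"
        return "Looking up weather for \(city)"
    },
]

/// Routes one call by name; unknown names are reported instead of crashing.
func dispatch(name: String, arguments: [String: Any]) -> String {
    guard let handler = handlers[name] else {
        return "Unknown tool: \(name)"
    }
    return handler(arguments)
}

print(dispatch(name: "set_timer", arguments: ["minutes": 5, "label": "tea"]))
// Timer 'tea' set for 5 min
```

Treating an unknown name as a reportable result rather than a fatal error is a deliberate choice here, since a small model may occasionally produce a tool name that was never offered.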

Kotlin (Android, LiteRT-LM)

On Android, the model is exposed through speech-android as audio.soniqo.speech.llm.FunctionGemma. It is a bring-your-own-runtime adapter: you pass in a LiteRtLmRuntime instance (the SDK ships a default), and FunctionGemma handles the prompt template and grammar parsing.

import audio.soniqo.speech.llm.FunctionGemma
import audio.soniqo.speech.llm.LiteRtLmRuntime

val runtime = LiteRtLmRuntime.fromPretrained(context)
val model = FunctionGemma(runtime)

val tools = """
- set_timer(minutes: Int, label: String)
- get_weather(city: String)
""".trimIndent()

val calls = model.callFunctions(
    tools = tools,
    userMessage = "Set a 5 minute tea timer",
)

for (call in calls) {
    println(call.name)       // "set_timer"
    println(call.arguments)  // {"minutes": 5, "label": "tea"}
}

Further reading

Speech Core: the C++ engine that hosts the LiteRT-LM runtime and the function-calling loop on Linux, Windows, and Android
github.com/soniqo/speech-core: the orchestration core and LiteRT-LM glue, including the VoicePipeline tool-calling loop
github.com/soniqo/speech-swift: the Apple SDK, which contains the FunctionGemma Swift class
github.com/soniqo/speech-android: the Android SDK, which contains audio.soniqo.speech.llm.FunctionGemma
google/gemma-3-270m: the upstream Gemma 3 270M base model on HuggingFace