AI Accelerator for AI CCTV
iSpur NPU
Overview
Dedicated accelerator for AI
The Supergate NPU is an AI processor designed for high-performance vision AI and generative AI inference. Based on Supergate's custom ISA and memory architecture, it is optimized for high-speed on-device processing of various computer vision and natural language processing models.
Key Features
Up to 64 TOPS / 32 TFLOPS performance
Dedicated compute engines and a parallel architecture process high-resolution image analysis and large-scale language model inference simultaneously.
Multi-channel simultaneous processing
Analyzes multiple real-time video streams in parallel at 15 to 30 FPS, accurately detecting more objects per scene.
Small language model (sLM) support
Supports generative AI inference with an in-house AI compiler and a low-level (C/C++) optimization backend.
Designed for power efficiency
A hierarchical embedded SRAM structure optimizes data locality, minimizing DRAM access and enabling low-power, high-efficiency computation.
On-device inference optimization
Supports FP16, FP8, and INT8 data types, allowing models to be optimized through quantization.
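As a format-level illustration of what INT8 quantization involves, here is a minimal sketch of symmetric per-tensor quantization and its reconstruction error. This is generic NumPy arithmetic, not the Supergate toolchain itself; the function names are illustrative.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: x is approximated by scale * q."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Example: quantize a weight tensor and check the reconstruction error.
w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
max_err = float(np.max(np.abs(w - w_hat)))
# Rounding error is bounded by half a quantization step.
assert max_err <= s / 2 + 1e-6
```

In practice, per-channel scales and calibration data tighten this error further; the bound above is the worst case for a single shared scale.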
AI-specific ISA implementation
- Users can directly control operators specialized for generative AI operations.
- Open-source and extensible hardware ISA design.
Various AI Model Zoo
A zoo of diverse AI models, ported to the Supergate NPU and validated for performance, helps you develop and deploy AI applications effectively.
- On-device serving through direct optimization of PyTorch- and GGUF-based models.
- Supports more than 60 recent sLM models.
AI development environment
A developer-friendly environment, including an AI compiler, enables effective development and deployment of various AI models.
Specifications
AI processing capabilities
The Supergate NPU's AI vision processor runs optimized generative language models alongside a range of vision and video processing algorithms.
Memory architecture
Maximizes data transfer speeds and minimizes power consumption through a unique memory architecture.
AI development environment and ecosystem
- Model conversion tools: support for GGUF, Hugging Face, safetensors, and PyTorch formats.
- Unified SDK: command-line API, performance analysis tools, debugging tools.
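GGUF is one of the supported input formats. As a format-level illustration (this is the public GGUF header layout, not the Supergate converter), a short sketch that validates a GGUF file header:

```python
import struct

def read_gguf_header(data: bytes):
    """Parse the fixed GGUF header: magic, version, tensor count, metadata KV count."""
    magic, version, n_tensors, n_kv = struct.unpack_from("<4sIQQ", data, 0)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    return version, n_tensors, n_kv

# Build a minimal GGUF v3 header in memory and parse it back.
header = struct.pack("<4sIQQ", b"GGUF", 3, 0, 0)
version, n_tensors, n_kv = read_gguf_header(header)
```

A conversion pipeline would read this header first to decide how to interpret the tensor and metadata sections that follow.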
Supported Models
Vision Model (Image-Based AI)
- YOLOv5/v8
- ViT (Vision Transformer)
- DeepSORT
- SAM (Segment Anything)
Generative AI Model (LLM)
- LLaMA 2/3
- TinyLlama
- Mistral
- Qwen
- CodeLlama
- Baichuan, Gemma, etc.
Application Cases
Smart City
CCTV object detection + automatic generation of natural language descriptions
Industrial sites
Equipment status recognition + maintenance alert generation
Drone Analyzer
Real-time video analysis and command-based narrative generation
Cashierless Store
Behavior detection + automatic product description provision
Performance
Computational performance
Up to 64 TOPS / 32 TFLOPS
Image inference
8 channels @ 30 FPS
Language processing speed
Up to 420 tokens/sec (based on LLaMA 2 7B)
Power consumption
Approx. 3–5 W (in an edge camera environment)
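The spec-sheet figures above imply the following per-frame and per-token latency budgets. This is simple arithmetic derived from the stated numbers, not additional measured data:

```python
# Back-of-envelope latency budgets implied by the spec figures.
channels, fps = 8, 30
frames_per_sec = channels * fps           # 240 inferences per second
per_frame_ms = 1000.0 / frames_per_sec    # ~4.17 ms per frame

tokens_per_sec = 420                      # LLaMA 2 7B figure
per_token_ms = 1000.0 / tokens_per_sec    # ~2.38 ms per token
```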
Additional Support
01
PCIe-based host integration
(x86 host → NPU command transfer)
02
LLM quantization and fixed-point support
03
Model Zoo continuous updates and 24/7 support