Position Details
PebbleSquare is a fabless company that develops AI semiconductors and AI solutions based on a PIM (processing-in-memory) architecture. To overcome the limitations of the von Neumann architecture, the company has successfully mass-produced high-performance, ultra-low-power PIM-based AI semiconductors, is field-testing a multi-core AI semiconductor in Korea and abroad, and is developing a range of AI solutions to accelerate the commercialization and adoption of AI semiconductors.
Key Responsibilities
We are seeking an experienced AI Model Compiler Engineer to develop and optimize model compilation pipelines for deep learning frameworks. You will convert AI models (e.g., ONNX, TensorFlow, PyTorch) into efficient, hardware-optimized code for edge and cloud AI processors. The role involves compiler optimizations, graph transformations, and hardware-specific acceleration techniques.
• Model Compilation Pipeline: Design and implement compilers that translate AI models (ONNX, TensorFlow, PyTorch, etc.) into executable formats for AI accelerators and edge devices.
• Graph Optimization: Apply operator fusion, pruning, quantization, and memory optimizations to improve model performance.
• Hardware Acceleration: Optimize AI model execution on CPU, GPU, DSP, TPU, or custom AI chips (e.g., NPU, FPGA).
• Intermediate Representations (IRs): Work with MLIR, TVM, XLA, Glow, or custom IRs for model transformation.
• Performance Tuning: Profile and analyze models using LLVM, Halide, CUDA, OpenCL, or Metal.
• Kernel Optimization: Develop low-level math libraries (SIMD, vectorized ops, matrix multiplications, tensor ops) for efficient AI inference.
• Custom Operator Support: Implement new AI operators and optimize execution on target hardware.
• Cross-Platform Deployment: Enable model portability across multiple architectures and backends.
• AI/ML Framework Integration: Extend compiler functionality for PyTorch, TensorFlow, ONNX Runtime, and other ML frameworks.
• Debugging & Benchmarking: Diagnose compilation and runtime issues, and benchmark model performance across target hardware.
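To make the graph-optimization duties above concrete, here is a minimal pure-Python sketch of an operator-fusion pass over a toy graph IR. The `Node` class and the `add` → `relu` rewrite are illustrative inventions for this posting, not PebbleSquare's actual compiler API; production passes (e.g., in TVM or MLIR) operate on far richer IRs.

```python
# Toy operator-fusion pass: collapse add -> relu chains into one fused node.
# The graph is a list of Nodes; Node.inputs holds indices of producer nodes.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                      # e.g. "matmul", "add", "relu"
    inputs: list = field(default_factory=list)   # indices of producer nodes

def fuse_add_relu(graph):
    """Rewrite add -> relu chains into a single fused 'add_relu' node."""
    # Map each node to its consumers, so we only fuse an add that feeds
    # exactly one relu (otherwise the add's output is needed elsewhere).
    consumers = {}
    for i, n in enumerate(graph):
        for j in n.inputs:
            consumers.setdefault(j, []).append(i)

    skip = set()                                 # add nodes absorbed by fusion
    for i, n in enumerate(graph):
        if (n.op == "relu" and len(n.inputs) == 1
                and graph[n.inputs[0]].op == "add"
                and consumers.get(n.inputs[0]) == [i]):
            skip.add(n.inputs[0])

    fused, remap = [], {}                        # remap: old index -> new index
    for i, n in enumerate(graph):
        if i in skip:
            continue                             # the add is folded into relu
        if n.op == "relu" and n.inputs and n.inputs[0] in skip:
            add = graph[n.inputs[0]]
            new = Node("add_relu", [remap[j] for j in add.inputs])
        else:
            new = Node(n.op, [remap[j] for j in n.inputs])
        remap[i] = len(fused)
        fused.append(new)
    return fused

# Example: matmul -> add -> relu collapses to matmul -> add_relu
g = [Node("matmul"), Node("add", [0]), Node("relu", [1])]
print([n.op for n in fuse_add_relu(g)])          # -> ['matmul', 'add_relu']
```

Fusing elementwise epilogues into their producer is a standard compiler transform: it removes an intermediate tensor and one full memory round-trip per fused pair.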
Qualifications
• Education: Bachelor's, Master's, or Ph.D. in Computer Science, Electrical Engineering, or a related field.
• Experience: 2+ years in model compilation, AI frameworks, or deep learning accelerators.
• Programming Languages: C, C++, Python, and LLVM IR or MLIR.
• Compiler Development: Experience with LLVM, TVM, XLA, Halide, Glow, or custom ML compilers.
• Graph Transformations: Knowledge of operator fusion, loop unrolling, constant folding, quantization, and tiling techniques.
• Hardware Optimization: Experience with SIMD, CUDA, OpenCL, ROCm, or low-level tensor operations.
• AI Frameworks: Hands-on with TensorFlow, PyTorch, ONNX, TensorRT, TFLite, or OpenVINO.
• Parallel Computing: Experience with multi-threading, vectorization (SSE/AVX), and heterogeneous computing.
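As one concrete instance of the quantization knowledge listed above, here is a minimal symmetric per-tensor int8 quantization round-trip in pure Python. The function names and the naive max-|x|-based scale are illustrative assumptions; real compiler pipelines derive scales from calibration data and lower the quantized ops to hardware kernels.

```python
# Symmetric per-tensor int8 quantization: q = round(x / scale), clipped to
# [-128, 127], with scale chosen so the largest magnitude maps near 127.

def quantize_int8(values):
    """Quantize floats to int8 with a symmetric scale from max |x|."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # guard all-zero input
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.0, 1.0]
q, s = quantize_int8(weights)
approx = dequantize(q, s)
# Round-trip error is bounded by half a quantization step (scale / 2).
print(q, s)
```

The symmetric scheme keeps zero exactly representable (code 0), which matters for padding and ReLU outputs; asymmetric schemes add a zero-point offset to use the int8 range more fully.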

