§ 01 about

Cyan Subhra Mishra

Fig. 0 — author photo · 2024

I'm a Performance and Power Engineer on the SoC Architecture Performance team at Arm in San Diego, where I work on pre-silicon performance and power analysis for next-generation heterogeneous SoCs targeting Edge AI, IoT, and automotive workloads.

I completed my Ph.D. in Computer Science and Engineering at Penn State in 2025, advised by Mahmut Taylan Kandemir and Jack Sampson at the Microsystems Design Lab. My research sits at the intersection of computer architecture and machine learning systems, with a focus on energy-efficient computing for resource-constrained environments — energy-harvesting sensors, computational storage, intermittent computing, and continuous learning at the edge.

Before Penn State I was a Design Engineer at Intel in Bengaluru, working on FPGA accelerators for ML and bioinformatics, and I completed a B.Tech + M.Tech dual degree at NIT Rourkela. I've also interned at Bell Labs (TVM-based inference serving) and, earlier, at Intel, IIT Bombay, and CAIR India.

Outside of work I think about mentoring, write occasional technical posts, and build tools for SoC performance work in my spare time. Based in San Diego, CA.

§ 02 education
  1. 01 2018 — 2025

    Ph.D., Computer Science and Engineering

    The Pennsylvania State University · University Park, PA

    Advised by Mahmut Taylan Kandemir and Jack Sampson at the Microsystems Design Lab. Thesis work on hardware/software co-design for ML systems — energy-harvesting sensors, computational storage, intermittent computing, and continuous learning at the edge.

    Computer Architecture · ML Systems · Energy Harvesting · Computational Storage
  2. 02 2011 — 2016

    B.Tech + M.Tech (dual), Electronics & Communication Engineering

    National Institute of Technology, Rourkela · Odisha, India

    Graduated with Honors (CGPA 8.39/10). Advised by Sarat Kumar Patra, with Tarjinder Singh at Intel as industry mentor.

    Honors · FPGA · Embedded Systems
§ 03 experience
  1. 01 2026 · Jan — present

    Performance and Power Engineer · SoC Architecture Performance

    Arm · San Diego, CA

    Pre-silicon performance and power analysis for next-generation heterogeneous SoCs targeting Edge AI, IoT, and automotive workloads.

    • Quantify latency, throughput, energy, and area impacts of compute subsystems across hardware and software stacks.
    • Develop performance evaluation and automation frameworks in Python and C++ for large-scale workload profiling, simulation, and regression benchmarking.
    • Characterize workloads with hardware counters and custom telemetry pipelines to identify bottlenecks and optimization opportunities.
    • Integrate performance data into CI workflows for automated regression detection.
  2. 02 2018 · Dec — 2025 · Dec

    Graduate Research Assistant · Microsystems Design Lab

    Penn State · University Park, PA

    Designed hardware/software co-design methodologies for ML systems with a focus on performance, energy efficiency, and resource utilization across heterogeneous platforms.

    • Up to 22% higher accuracy on intermittently powered DNNs with minimal computational overhead (NExUME, ICLR 2025).
    • High-throughput computational storage architectures reducing data movement by 6.1× (Salient Store, PACT 2025).
    • Performance modeling for multi-dimensional optimization of large-scale ML deployments.
  3. 03 2021 · Jun — Aug

    Research Intern

    Bell Labs · Murray Hill, NJ

    Optimization strategies for autonomous ML inference serving across heterogeneous platforms (GPUs, FPGAs), leveraging Apache TVM for cross-platform kernel optimization. Implemented pruning, quantization, and knowledge distillation pipelines.

  4. 04 2016 · Jun — 2018 · Jun

    Design Engineer

    Intel · Bengaluru, India

    • Hardware/software co-design for ML accelerators (GPU + FPGA), with systematic performance modeling.
    • FPGA-deployable convolution and softmax kernels balancing efficiency and resource utilization.
    • Software/hardware simulation frameworks for accelerator validation and rapid iteration.
  5. 05 2015 · Dec — 2016 · Jun

    Research Intern

    Intel · Bengaluru, India

    FPGA-based accelerators for protein search algorithms (pairHMM, HMMer) using OpenCL, with significant speedups over CPU implementations.

  6. 06 2014 · May — Jul

    Research Intern · CSRE

    IIT Bombay · Mumbai, India

    Computational models for pre-processing hyperspectral satellite imagery — feature identification and data extraction algorithms.

  7. 07 2013 · May — Jul

    Research Intern

    Centre for Artificial Intelligence and Robotics (CAIR) · Bengaluru, India

    Generic method for azimuthal map projection and coordinate transformation — published in Defence Science Journal (2015).

§ 04 research interests
Computer Architecture for ML · Sustainable Computing · Edge & Energy Harvesting · Intermittent Computing · Hardware/Software Co-design · Continuous Learning · MoE Inference · Computational Storage · Performance Modeling