Microsoft AI Research Introduces DeepSpeed-MII, A New Open-Source Python Library From DeepSpeed That Speeds Up 20,000+ Widely Used Deep Learning Models - MarkTechPost
While open-source software has made AI accessible to more people, two significant barriers to its widespread use remain: inference latency and cost. System optimizations have come a long way and can substantially reduce the latency and cost of DL model inference, but they are not readily accessible. Many data scientists lack the expertise to identify and implement the set of system optimizations relevant to a specific model, which puts low-latency, low-cost inference largely out of reach. The complex nature of the DL model inference landscape, including wide variations in model size, architecture, system performance characteristics, hardware requirements,