Scalable HMM-based Inference Engine in Large Vocabulary Continuous Speech Recognition

Title: Scalable HMM-based Inference Engine in Large Vocabulary Continuous Speech Recognition
Publication Type: Conference Paper
Year of Publication: 2009
Authors: Chong, J., You, K., Yi, Y., Gonina, E., Hughes, C., Sung, W., & Keutzer, K.
Conference Name: IEEE International Conference on Multimedia and Expo

Parallel scalability allows an application to efficiently
utilize an increasing number of processing elements. In
this paper we explore a design space for application scalability
for an inference engine in large vocabulary continuous speech
recognition (LVCSR). Our implementation of the inference engine
involves a parallel graph traversal through an irregular
graph-based knowledge network with millions of states and
arcs. The challenge is not only to define a software architecture
that exposes sufficient fine-grained application concurrency, but
also to efficiently synchronize between an increasing number of
concurrent tasks and to effectively utilize the parallelism opportunities
in today’s highly parallel processors. We propose four
application-level implementation alternatives we call “algorithm
styles”, and construct highly optimized implementations on two
parallel platforms: an Intel Core i7 multicore processor and an
NVIDIA GTX280 manycore processor. The highest performing
algorithm style varies with the implementation platform. On a
44-minute speech data set, we demonstrate substantial speedups
of 3.4× on Core i7 and 10.5× on GTX280 compared to a highly
optimized sequential implementation on Core i7 without sacrificing
accuracy. The parallel implementations contain less than
2.5% sequential overhead, promising scalability and significant
potential for further speedup on future platforms.
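
The link between a small sequential overhead and the potential for further speedup can be illustrated with Amdahl's law. The sketch below is not taken from the paper: it assumes the 2.5% figure can be read as the serial fraction of a fixed-size workload, which the abstract does not state, and the function names and worker counts are hypothetical, chosen only for illustration.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed-size workload."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def amdahl_limit(serial_fraction: float) -> float:
    """Upper bound on speedup as the number of workers grows without limit."""
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


# Assumption: the abstract's "less than 2.5% sequential overhead" is treated
# here as an Amdahl serial fraction of 0.025. The paper may define it differently.
SERIAL_FRACTION = 0.025

if __name__ == "__main__":
    # Hypothetical worker counts, not the core counts of the paper's platforms.
    for workers in (1, 4, 16, 64, 256):
        speedup = amdahl_speedup(SERIAL_FRACTION, workers)
        print(f"{workers:4d} workers: {speedup:5.1f}x")
    print(f"limit: {amdahl_limit(SERIAL_FRACTION):.1f}x")
```

Under this reading, a serial fraction of at most 2.5% caps the achievable speedup at roughly 40×, well above the reported 3.4× and 10.5×, which is consistent with the abstract's claim that headroom remains on future platforms. Synchronization and memory-bandwidth costs are ignored in this model.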
