Speech coding

From Wikipedia, the free encyclopedia

Speech coding is an application of data compression to digital audio signals containing speech. It models the speech signal by estimating speech-specific parameters with audio signal processing techniques, then uses generic data compression algorithms to represent the resulting parameters in a compact bitstream.[1]

Common applications of speech coding are mobile telephony and voice over IP (VoIP).[2] The most widely used speech coding technique in mobile telephony is linear predictive coding (LPC), while the most widely used in VoIP applications are the LPC and modified discrete cosine transform (MDCT) techniques.[citation needed]

The techniques employed in speech coding are similar to those used in audio data compression and audio coding, where knowledge of psychoacoustics is used to transmit only the data that is relevant to the human auditory system. For example, in voiceband speech coding, only information in the frequency band from 400 to 3500 Hz is transmitted, yet the reconstructed signal retains adequate intelligibility.

Speech coding differs from other forms of audio coding in that speech is a simpler signal than other audio signals, and statistical information is available about the properties of speech. As a result, some auditory information that is relevant in general audio coding can be unnecessary in the speech coding context. Speech coding stresses the preservation of intelligibility and pleasantness of speech while using a constrained amount of transmitted data.[3] In addition, most speech applications require low coding delay, as latency interferes with speech interaction.[4]

Categories

Speech coders are of two classes:[5]

  1. Waveform coders
  2. Vocoders

Sample companding viewed as a form of speech coding

The A-law and μ-law algorithms used in G.711 PCM digital telephony can be seen as an early precursor of speech encoding, requiring only 8 bits per sample but giving effectively 12 bits of resolution.[7] Logarithmic companding is consistent with human hearing perception in that low-amplitude noise is heard alongside a low-amplitude speech signal but is masked by a high-amplitude one. Although this would generate unacceptable distortion in a music signal, the peaky nature of speech waveforms, combined with the simple frequency structure of speech as a periodic waveform having a single fundamental frequency with occasional added noise bursts, makes these very simple instantaneous compression algorithms acceptable for speech.[citation needed][dubious]
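The logarithmic companding described above can be illustrated with a minimal sketch. It uses the continuous μ-law curve with μ = 255 followed by uniform 8-bit quantization; this is an assumption made for illustration and is not the bit-exact segmented encoding specified by G.711. The function names are hypothetical.

```python
import math

MU = 255  # mu-law parameter commonly used for 8-bit telephony


def mu_law_compress(x: float, mu: int = MU) -> float:
    """Map a sample in [-1, 1] onto a logarithmic scale in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def encode_8bit(x: float) -> int:
    """Clip, compress, then quantize uniformly to a code in 0..255."""
    y = mu_law_compress(max(-1.0, min(1.0, x)))
    return int(round((y + 1.0) / 2.0 * 255))


def decode_8bit(code: int) -> float:
    """Map a code in 0..255 back to an approximate sample value."""
    y = code / 255 * 2.0 - 1.0
    return mu_law_expand(y)


if __name__ == "__main__":
    for sample in (0.001, 0.01, 0.1, 0.9):
        restored = decode_8bit(encode_8bit(sample))
        print(f"{sample:>6} -> {restored:.6f}")
```

Because the quantization steps are finer near zero, quiet samples are reproduced with a much smaller absolute error than a uniform 8-bit quantizer would give, while loud samples tolerate a coarser step.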

A wide variety of other algorithms were tried at the time, mostly delta modulation variants, but after careful consideration the designers of the early digital telephony systems chose the A-law/μ-law algorithms. At the time of their design, their 33% bandwidth reduction at very low complexity was an excellent engineering compromise. Their audio performance remains acceptable, and there has been no need to replace them in the stationary phone network.[citation needed]
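The 33% figure follows from replacing roughly 12 bits of linear resolution with 8 companded bits per sample. The short calculation below is a sketch that assumes the 8 kHz sampling rate of narrowband G.711 telephony, which is not stated in the text above; the constant and function names are illustrative.

```python
SAMPLE_RATE_HZ = 8000   # assumed: narrowband telephony sampling rate
LINEAR_BITS = 12        # effective resolution quoted in the text
COMPANDED_BITS = 8      # bits per A-law or mu-law sample


def bit_rate(sample_rate_hz: int, bits_per_sample: int) -> int:
    """Bit rate in bit/s of an uncompressed sample stream."""
    return sample_rate_hz * bits_per_sample


linear_rate = bit_rate(SAMPLE_RATE_HZ, LINEAR_BITS)
companded_rate = bit_rate(SAMPLE_RATE_HZ, COMPANDED_BITS)
reduction = 1 - companded_rate / linear_rate

print(f"linear:    {linear_rate} bit/s")
print(f"companded: {companded_rate} bit/s")
print(f"reduction: {reduction:.0%}")
```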

In 2008, the G.711.1 codec, which has a scalable structure, was standardized by ITU-T. Its input sampling rate is 16 kHz.[8]

Modern speech compression

Much of the later work in speech compression was motivated by military research into digital communications for secure military radios, where very low data rates were used to achieve effective operation in a hostile radio environment. At the same time, VLSI circuits made far more processing power available than earlier compression techniques could draw on. As a result, modern speech compression algorithms could use far more complex techniques than were available in the 1960s to achieve far higher compression ratios.

The most widely used speech coding algorithms are based on linear predictive coding (LPC).[9] In particular, the most common speech coding scheme is the LPC-based code-excited linear prediction (CELP) coding, which is used, for example, in the GSM standard. In CELP, the modeling is divided into two stages: a linear predictive stage that models the spectral envelope, and a codebook-based model of the residual of the linear predictive model. The linear prediction coefficients are computed and quantized, usually as line spectral pairs (LSPs). In addition to the actual speech coding of the signal, it is often necessary to use channel coding for transmission, to avoid losses due to transmission errors. To get the best overall coding results, speech coding and channel coding methods are chosen in pairs, with the more important bits in the speech data stream protected by more robust channel coding.
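The linear predictive stage can be sketched as follows. This is a minimal illustration that assumes the autocorrelation method with the Levinson-Durbin recursion, a common textbook choice rather than the procedure of any particular standard. It stops at the prediction residual: the codebook search and the LSP quantization that CELP performs are not shown, and all function names are hypothetical.

```python
import math


def autocorrelation(frame, max_lag):
    """Autocorrelation of one frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (a, err) for the predictor x_hat[n] = sum_k a[k] * x[n-1-k]."""
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for predictor order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err


def residual(frame, a):
    """Prediction error of the frame, taking samples before the frame as zero."""
    out = []
    for n, x in enumerate(frame):
        pred = sum(a[k] * frame[n - 1 - k]
                   for k in range(len(a)) if n - 1 - k >= 0)
        out.append(x - pred)
    return out


if __name__ == "__main__":
    # Synthetic "voiced" frame: a 200 Hz tone sampled at 8 kHz (illustrative).
    frame = [math.cos(2 * math.pi * 200 * n / 8000) for n in range(240)]
    coeffs, _ = levinson_durbin(autocorrelation(frame, 2), 2)
    res = residual(frame, coeffs)
    ratio = sum(e * e for e in res) / sum(x * x for x in frame)
    print("coefficients:", coeffs)
    print("residual energy / signal energy:", ratio)
```

The residual carries far less energy than the frame itself, which is why a coder can spend most of its bits on a compact description of the predictor and a coarse model of the residual.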

The modified discrete cosine transform (MDCT) is the basis of the LD-MDCT technique used by the AAC-LD format, introduced in 1999.[10] MDCT has since been widely adopted in voice-over-IP (VoIP) applications, such as the G.729.1 wideband audio codec introduced in 2006,[11] Apple's FaceTime (using AAC-LD) introduced in 2010,[12] and the CELT codec introduced in 2011.[13]

Opus is a free software audio coder. It combines the speech-oriented LPC-based SILK algorithm and the lower-latency MDCT-based CELT algorithm, switching between or combining them as needed for maximal efficiency.[14][15] It is widely used for VoIP calls in WhatsApp.[16][17][18] The PlayStation 4 video game console also uses Opus for its PlayStation Network system party chat.[19]

A number of codecs with even lower bit rates have been demonstrated. Codec2, which operates at bit rates as low as 450 bit/s, sees use in amateur radio.[20] NATO currently uses MELPe, offering intelligible speech at 600 bit/s and below.[21] Neural vocoder approaches have also emerged: Lyra by Google gives an "almost eerie" quality at 3 kbit/s.[22] Microsoft's Satin also uses machine learning, but uses a higher tunable bitrate and is wideband.[23]

Sub-fields

Wideband audio coding
Narrowband audio coding

References

  1. ^ Arjona Ramírez, M.; Minami, M. (2003). "Low bit rate speech coding". Wiley Encyclopedia of Telecommunications, J. G. Proakis, Ed. 3. New York: Wiley: 1299–1308.
  2. ^ M. Arjona Ramírez and M. Minami, "Technology and standards for low-bit-rate vocoding methods," in The Handbook of Computer Networks, H. Bidgoli, Ed., New York: Wiley, 2011, vol. 2, pp. 447–467.
  3. ^ P. Kroon, "Evaluation of speech coders," in Speech Coding and Synthesis, W. Bastiaan Kleijn and K. K. Paliwal, Ed., Amsterdam: Elsevier Science, 1995, pp. 467-494.
  4. ^ J. H. Chen, R. V. Cox, Y.-C. Lin, N. S. Jayant, and M. J. Melchner, A low-delay CELP coder for the CCITT 16 kb/s speech coding standard. IEEE J. Select. Areas Commun. 10(5): 830-849, June 1992.
  5. ^ "Soo Hyun Bae, ECE 8873 Data Compression & Modeling, Georgia Institute of Technology, 2004". Archived from the original on 7 September 2006.
  6. ^ Zeghidour, Neil; Luebs, Alejandro; Omran, Ahmed; Skoglund, Jan; Tagliasacchi, Marco (2022). "SoundStream: An End-to-End Neural Audio Codec". IEEE/ACM Transactions on Audio, Speech, and Language Processing. 30: 495–507. arXiv:2107.03312. doi:10.1109/TASLP.2021.3129994. S2CID 236149944.
  7. ^ Jayant, N. S.; Noll, P. (1984). Digital coding of waveforms. Englewood Cliffs: Prentice-Hall.
  8. ^ G.711.1 : Wideband embedded extension for G.711 pulse code modulation, ITU-T, 2012, retrieved 2025-08-07
  9. ^ Gupta, Shipra (May 2016). "Application of MFCC in Text Independent Speaker Recognition" (PDF). International Journal of Advanced Research in Computer Science and Software Engineering. 6 (5): 805–810 (806). ISSN 2277-128X. S2CID 212485331. Archived from the original (PDF) on 2025-08-07. Retrieved 18 October 2019.
  10. ^ Schnell, Markus; Schmidt, Markus; Jander, Manuel; Albert, Tobias; Geiger, Ralf; Ruoppila, Vesa; Ekstrand, Per; Bernhard, Grill (October 2008). MPEG-4 Enhanced Low Delay AAC - A New Standard for High Quality Communication (PDF). 125th AES Convention. Fraunhofer IIS. Audio Engineering Society. Retrieved 20 October 2019.
  11. ^ Nagireddi, Sivannarayana (2008). VoIP Voice and Fax Signal Processing. John Wiley & Sons. p. 69. ISBN 9780470377864.
  12. ^ Daniel Eran Dilger (June 8, 2010). "Inside iPhone 4: FaceTime video calling". AppleInsider. Retrieved June 9, 2010.
  13. ^ Presentation of the CELT codec Archived 2025-08-07 at the Wayback Machine by Timothy B. Terriberry (65 minutes of video, see also presentation slides in PDF)
  14. ^ "Opus Codec". Opus (Home page). Xiph.org Foundation. Retrieved July 31, 2012.
  15. ^ Valin, Jean-Marc; Maxwell, Gregory; Terriberry, Timothy B.; Vos, Koen (October 2013). High-Quality, Low-Delay Music Coding in the Opus Codec. 135th AES Convention. Audio Engineering Society. arXiv:1602.04845.
  16. ^ Leyden, John (27 October 2015). "WhatsApp laid bare: Info-sucking app's innards probed". The Register. Retrieved 19 October 2019.
  17. ^ Hazra, Sudip; Mateti, Prabhaker (September 13–16, 2017). "Challenges in Android Forensics". In Thampi, Sabu M.; Pérez, Gregorio Martínez; Westphall, Carlos Becker; Hu, Jiankun; Fan, Chun I.; Mármol, Félix Gómez (eds.). Security in Computing and Communications: 5th International Symposium, SSCC 2017. Springer. pp. 286–299 (290). doi:10.1007/978-981-10-6898-0_24. ISBN 9789811068980.
  18. ^ Srivastava, Saurabh Ranjan; Dube, Sachin; Shrivastaya, Gulshan; Sharma, Kavita (2019). "Smartphone Triggered Security Challenges: Issues, Case Studies and Prevention". In Le, Dac-Nhuong; Kumar, Raghvendra; Mishra, Brojo Kishore; Chatterjee, Jyotir Moy; Khari, Manju (eds.). Cyber Security in Parallel and Distributed Computing: Concepts, Techniques, Applications and Case Studies. John Wiley & Sons. pp. 187–206 (200). doi:10.1002/9781119488330.ch12. ISBN 9781119488057. S2CID 214034702.
  19. ^ "Open Source Software used in PlayStation4". Sony Interactive Entertainment Inc. Retrieved 2025-08-07.[failed verification]
  20. ^ "GitHub - Codec2". GitHub. November 2019.
  21. ^ Alan McCree, “A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 2006, pp. I 705–708, Toulouse, France
  22. ^ Buckley, Ian (2025-08-07). "Google Makes Its Lyra Low Bitrate Speech Codec Public". MakeUseOf. Retrieved 2025-08-07.
  23. ^ Levent-Levi, Tsahi (2025-08-07). "Lyra, Satin and the future of voice codecs in WebRTC". BlogGeek.me. Retrieved 2025-08-07.
  24. ^ "LPCNet: Efficient neural speech synthesis". Xiph.Org Foundation. 8 August 2023.