Viseme-Based Indonesian Lip Reading Modeling Using VGG16 with Jaro-Winkler Similarity and Bigrams
(1) Informatics Engineering Study Program, Universitas Kristen Petra, Surabaya
(2) Informatics Engineering Study Program, Universitas Kristen Petra, Surabaya
(3) Informatics Engineering Study Program, Universitas Kristen Petra, Surabaya
(*) Corresponding Author