Optimal 3D Sound Source Localization using a Numerical Method

Please use this identifier to cite or link to this item: http://ithesis-ir.su.ac.th/dspace/handle/123456789/6091

Title:	Optimal 3D Sound Source Localization using a Numerical Method
Authors:	Chaowapat CHANITKITJAROENPORN (เชาวพัฒน์ ชนิดกิจเจริญพร); CHUKIET SODSRI (ชูเกียรติ สอดศรี), Silpakorn University, sodsri_c@su.ac.th
Keywords:	Sound source localization; Time difference of arrival; Numerical method; Interior point method
Issue Date:	4
Publisher:	Silpakorn University
Abstract:	This work presents a method for sound source localization in three-dimensional (3D) coordinates that uses no more than 4 microphones and a numerical approach to find the optimal position of the sound source. The study began with 3 microphones in a symmetric arrangement locating two-dimensional (2D) sound source positions in the first quadrant. Two approaches were used and compared: the first takes the intersection of the direct sound travel paths as the source position, and the second applies a numerical solution to an objective function. In the first approach, the source was assumed to be in the far field, and the time differences of arrival (TDOAs) at pairs of microphones, obtained with the generalized cross-correlation phase transform (GCC-PHAT), were used to derive the angle of incidence, and hence the straight line, of the direct path from the sound source to the midpoint of each microphone pair. The intersections of these lines were averaged and taken as the sound source position. In the second approach, the TDOAs and Euclidean distances were used to construct an objective function, which was minimized under constraints with the interior point method to yield an optimal sound source position. Experiments located 12 impulsive sound source positions in the first quadrant, in the same plane as the microphones. Both the direct-path intersection and numerical approaches located far-field sources accurately, with average errors of 1.53% and 1.01%, respectively. In the near field, however, the numerical approach was more accurate, with an average error of 1.58% compared with 4.04% for the direct-path intersection approach. This is because the numerical approach does not rely on the far-field assumption and can therefore locate both near-field and far-field positions.
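The direct-path intersection idea for the 2D case can be illustrated with a minimal Python sketch. It is not the authors' implementation: the speed of sound, the microphone coordinates used in the example, the function names, and the `hint` point used to resolve the left/right ambiguity of each microphone pair are all assumptions made for illustration. For a pair with spacing d and TDOA tau, the far-field model gives cos(theta) = c * tau / d, where theta is measured from the pair axis.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the abstract


def far_field_line(mic_a, mic_b, tdoa, hint, c=SPEED_OF_SOUND):
    """Return (midpoint, unit direction) of the far-field path for one pair.

    tdoa is t_a - t_b in seconds (positive when the sound reaches mic_b first).
    hint is any point on the same side of the pair axis as the source; one
    pair alone cannot tell the two sides apart, so this is an assumption.
    """
    ax, ay = mic_a
    bx, by = mic_b
    d = math.hypot(bx - ax, by - ay)
    ux, uy = (bx - ax) / d, (by - ay) / d            # unit vector from a to b
    mx, my = (ax + bx) / 2.0, (ay + by) / 2.0        # midpoint of the pair
    cos_t = max(-1.0, min(1.0, c * tdoa / d))        # clamp against noise
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    nx, ny = -uy, ux                                 # normal to the pair axis
    side = 1.0 if (hint[0] - mx) * nx + (hint[1] - my) * ny >= 0.0 else -1.0
    direction = (cos_t * ux + side * sin_t * nx, cos_t * uy + side * sin_t * ny)
    return (mx, my), direction


def intersect(line1, line2):
    """Intersect two lines given as (point, direction); None if parallel."""
    (x1, y1), (dx1, dy1) = line1
    (x2, y2), (dx2, dy2) = line2
    cross = dx1 * dy2 - dy1 * dx2
    if abs(cross) < 1e-12:
        return None
    s = ((x2 - x1) * dy2 - (y2 - y1) * dx2) / cross
    return (x1 + s * dx1, y1 + s * dy1)


def locate(mics, tdoas, hint, c=SPEED_OF_SOUND):
    """Average the pairwise intersections of the far-field lines.

    tdoas maps an index pair (i, j) to t_i - t_j in seconds.
    """
    lines = [far_field_line(mics[i], mics[j], tdoa, hint, c)
             for (i, j), tdoa in tdoas.items()]
    points = []
    for line1, line2 in combinations(lines, 2):
        point = intersect(line1, line2)
        if point is not None:
            points.append(point)
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


if __name__ == "__main__":
    # Hypothetical layout: 3 microphones, source in the first quadrant.
    mics = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2)]
    source = (3.0, 4.0)
    t = [math.hypot(source[0] - x, source[1] - y) / SPEED_OF_SOUND for x, y in mics]
    tdoas = {(0, 1): t[0] - t[1], (0, 2): t[0] - t[2], (1, 2): t[1] - t[2]}
    print(locate(mics, tdoas, hint=(1.0, 1.0)))
```

The TDOAs in the example are computed from the known geometry rather than measured with GCC-PHAT, so the sketch isolates the geometric step. Because the lines are nearly parallel for a distant source, small bearing errors are amplified in range, which is consistent with the approach depending on the far-field assumption.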
2D localization in all 4 quadrants was also studied, using 4 microphones in the same plane in both symmetric and asymmetric arrangements. The experiments used twenty impulsive sound source positions in the near and far fields, covering all 4 quadrants, at 0.3 m, 0.6 m, 0.9 m, and 1.2 m from the microphone plane. For sources outside the microphone plane, the average error of both approaches grew with distance from the plane: with the symmetric arrangement, the average errors at 0.3 m, 0.6 m, 0.9 m, and 1.2 m were 13.17%, 36.46%, 45.68%, and 59.85% for the direct-path intersection approach and 9.40%, 36.21%, 45.54%, and 56.85% for the numerical approach. However, for a source near the microphone plane (0.3 m from it) at 45 degrees in the x-y plane, that is, along the midline of a quadrant, the asymmetric arrangement was far more accurate, with a localization error of 6.79% compared with 47.51% and 32.29% for the direct-path intersection and numerical approaches with the symmetric arrangement. Because the asymmetric arrangement produces distinct TDOAs for the microphone pairs, combining it with the numerical approach may make it possible to localize sources at all positions in both the near and far fields. A new approach for 3D sound source localization was then proposed, using 4 microphones in an asymmetric, non-coplanar arrangement and minimizing an objective function with the interior point method. The objective function was constructed from the measured TDOAs and the Euclidean distances in 3D space. For comparison, the direct-path intersection approach was also extended to 3D, with the intersection of 3 cone surfaces taken as the sound source location.
Twenty impulsive sound sources in the near and far fields, covering all quadrants and including some at 45 degrees from the origin (along the midline of a quadrant), were used to test performance. The direct-path (cone surface intersection) approach showed very limited accuracy: only a few positions, in the far field and close to one of the axes, were located accurately. In contrast, the proposed numerical approach localized all positions accurately, in every quadrant and in both the near and far fields, with an average error of 1.24%.
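The numerical approach in the abstract minimizes an objective function built from measured TDOAs and 3D Euclidean distances, subject to constraints. The Python sketch below shows one plausible form of such an objective, the sum of squared mismatches between modelled and measured range differences relative to a reference microphone, together with a crude log-barrier scheme that keeps every iterate strictly inside a bounding box. The exact objective, constraints, and solver used in the thesis are not given in the abstract, so everything here is an assumption: the objective form, the box constraints, the speed of sound, the microphone layout, and the derivative-free compass search that stands in for a real interior point solver.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the abstract


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def tdoa_objective(p, mics, tdoas, c=SPEED_OF_SOUND):
    """Sum of squared mismatches between modelled and measured range differences.

    tdoas[i - 1] is the arrival time at mics[i] minus the arrival time at mics[0].
    """
    d0 = distance(p, mics[0])
    return sum((distance(p, m) - d0 - c * t) ** 2
               for m, t in zip(mics[1:], tdoas))


def log_barrier(p, lower, upper):
    """Barrier for the box lower < p < upper; +inf on or outside the boundary."""
    total = 0.0
    for x, lo, hi in zip(p, lower, upper):
        if not lo < x < hi:
            return math.inf
        total -= math.log(x - lo) + math.log(hi - x)
    return total


def compass_search(f, start, step, tol=1e-7, max_sweeps=2000):
    """Derivative-free descent: try +/- step on each axis, halve step on failure."""
    p, best = list(start), f(start)
    for _ in range(max_sweeps):
        if step < tol:
            break
        improved = False
        for k in range(len(p)):
            for s in (step, -step):
                q = p[:]
                q[k] += s
                value = f(q)
                if value < best:          # infeasible points give inf and are rejected
                    p, best, improved = q, value, True
        if not improved:
            step *= 0.5
    return p


def localize(mics, tdoas, lower, upper, c=SPEED_OF_SOUND):
    """Minimize objective + mu * barrier for a decreasing sequence of mu."""
    p = [(lo + hi) / 2.0 for lo, hi in zip(lower, upper)]   # strictly interior start
    step0 = 0.25 * min(hi - lo for lo, hi in zip(lower, upper))
    mu = 1.0
    for _ in range(10):
        def penalized(q, mu=mu):
            return tdoa_objective(q, mics, tdoas, c) + mu * log_barrier(q, lower, upper)
        p = compass_search(penalized, p, step0)
        mu *= 0.1
    return p


if __name__ == "__main__":
    # Hypothetical asymmetric, non-coplanar layout of 4 microphones (metres).
    mics = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.4, 0.0), (0.0, 0.0, 0.3)]
    source = (0.8, 0.6, 0.5)
    d0 = distance(source, mics[0])
    tdoas = [(distance(source, m) - d0) / SPEED_OF_SOUND for m in mics[1:]]
    print(localize(mics, tdoas, lower=(0.05,) * 3, upper=(2.0,) * 3))
```

Unlike the far-field line or cone construction, this objective uses exact distances, so it makes no far-field assumption. Three TDOAs from 4 microphones can admit more than one consistent position, so the bounding box and the starting point matter; a production solver would typically use gradients and a tested optimization library rather than this search.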
URI:	http://ithesis-ir.su.ac.th/dspace/handle/123456789/6091
Appears in Collections:	Engineering and Industrial Technology

Files in This Item:

File	Description	Size	Format
630920049.pdf		5.1 MB	Adobe PDF
