IMAGE RETRIEVAL USING CONVOLUTIONAL NEURAL NETWORKS AND A CLUSTER GRAPH

Phạm Hoàng Phương; Đỗ Xuân Hiệp; Nguyễn Thị Định; Văn Thế Thành

doi:10.54607/hcmue.js.20.7.3615(2023)

Abstract

This paper presents an image retrieval model that combines a convolutional neural network (CNN) with a cluster graph structure in order to improve retrieval performance and reduce query time. The approach has three parts: (1) a CNN detects and classifies the objects in an image; (2) a cluster graph structure is built and used to construct an ontology; (3) the set of similar images is extracted from the ontology by searching it with a SPARQL query. For each input image, the objects are first classified by the CNN, a feature vector is extracted, the image is assigned to a class, and the ontology is then queried to retrieve the set of similar images. Based on this theoretical foundation, an image retrieval model is proposed and evaluated on the COCO and Flickr image sets, achieving precision of 0.7950 and 0.8116 respectively. The experimental results indicate that the proposed method is sound; it is compared with other works on the same image sets to assess its effectiveness, and it can be applied to other datasets.
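The abstract does not give the details of the cluster graph, so the following is only a minimal sketch of one plausible reading: feature vectors are grouped into clusters by a distance threshold, nearby clusters are linked by edges, and a query is answered from the nearest cluster and its neighbors. The class name `ClusterGraph`, the thresholds `theta` and `link`, the Euclidean distance, and the toy 2-D vectors are all assumptions made for illustration; they are not taken from the paper. The CNN feature extraction, the ontology, and the SPARQL step are not shown.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ClusterGraph:
    """Hypothetical cluster graph: threshold-based clusters plus neighbor edges."""

    def __init__(self, theta, link):
        self.theta = theta    # assumed: max distance to join an existing cluster
        self.link = link      # assumed: max centroid distance for an edge
        self.centroids = []   # one centroid per cluster
        self.members = []     # per cluster: list of (image_id, vector)
        self.edges = {}       # cluster index -> set of neighboring cluster indices

    def _nearest(self, vec):
        if not self.centroids:
            return None
        return min(range(len(self.centroids)),
                   key=lambda i: dist(vec, self.centroids[i]))

    def _relink(self, c):
        # Recompute the edges of cluster c after its centroid changed.
        for other in range(len(self.centroids)):
            if other == c:
                continue
            if dist(self.centroids[c], self.centroids[other]) <= self.link:
                self.edges[c].add(other)
                self.edges[other].add(c)
            else:
                self.edges[c].discard(other)
                self.edges[other].discard(c)

    def add(self, image_id, vec):
        best = self._nearest(vec)
        if best is None or dist(vec, self.centroids[best]) > self.theta:
            # Too far from every cluster: start a new one.
            self.centroids.append(list(vec))
            self.members.append([(image_id, vec)])
            best = len(self.centroids) - 1
            self.edges[best] = set()
        else:
            # Join the nearest cluster and update its centroid to the mean.
            self.members[best].append((image_id, vec))
            n = len(self.members[best])
            self.centroids[best] = [
                sum(v[i] for _, v in self.members[best]) / n
                for i in range(len(vec))
            ]
        self._relink(best)
        return best

    def query(self, vec, k):
        """Return up to k image ids from the nearest cluster and its neighbors."""
        start = self._nearest(vec)
        if start is None:
            return []
        candidates = []
        for c in {start} | self.edges[start]:
            candidates.extend(self.members[c])
        candidates.sort(key=lambda item: dist(vec, item[1]))
        return [image_id for image_id, _ in candidates[:k]]


# Toy data standing in for CNN feature vectors (illustrative only).
graph = ClusterGraph(theta=1.0, link=3.0)
for image_id, vec in [("a", (0.0, 0.0)), ("b", (0.5, 0.0)),
                      ("c", (2.5, 0.0)), ("d", (10.0, 10.0))]:
    graph.add(image_id, vec)

near_origin = graph.query((0.4, 0.0), k=3)
far_away = graph.query((9.0, 9.0), k=3)
```

Restricting the search to one cluster and its linked neighbors is what would reduce query time relative to scanning every image; the thresholds control the trade-off between the size of the candidate set and the chance of missing a similar image.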

Keywords

convolutional neural networks; image retrieval; similar images; SPARQL

References

Asim, M. N., Wasim, M., Khan, M. U. G., Mahmood, N., & Mahmood, W. (2019). The use of ontology in retrieval: a study on textual, multilingual, and multimedia retrieval. IEEE Access, 7, 21662-21686.

Bharati, P., & Pramanik, A. (2020). Deep learning techniques—R-CNN to mask R-CNN: A survey. Computational Intelligence in Pattern Recognition: Proceedings of CIPR 2019, 657-668.

Cao, Y., Long, M., Wang, J., Yang, Q., & Yu, P. S. (2016). Deep visual-semantic hashing for cross-modal retrieval. Paper presented at the Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

Cevikalp, H., Elmas, M., & Ozkan, S. (2018). Large-scale image retrieval using transductive support vector machines. Computer Vision and Image Understanding, 173, 2-12.

Dinh, N. T., Le, T. M., & Van, T. T. (2022). An Improvement Method of Kd-Tree Using k-Means and k-NN for Semantic-Based Image Retrieval System. In Information Systems and Technologies: WorldCIST 2022, Volume 2 (pp. 177-187). Springer.

Dinh, N. T., Nhi, N. T. U., Le, T. M., & Van, T. T. (2023). A model of image retrieval based on KD-Tree Random Forest. Data Technologies and Applications.

Flickr. (2017). Dataset Flickr 2017. Retrieved from https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset

Jabeen, S., Mehmood, Z., Mahmood, T., Saba, T., Rehman, A., & Mahmood, M. T. (2018). An effective content-based image retrieval technique for image visuals representation based on the bag-of-visual-words model. PloS one, 13(4), e0194526.

Kumar, A., Dyer, S., Kim, J., Li, C., Leong, P. H., Fulham, M., & Feng, D. (2016). Adapting content-based image retrieval techniques for the semantic annotation of medical images. Computerized Medical Imaging and Graphics, 49, 37-45. DOI:10.1016/j.compmedimag.2016.01.001

Lin, C. H., Chen, C. C., Lee, H. L., & Liao, J. R. (2014). Fast K-means algorithm based on a level histogram for image retrieval. Expert Systems with Applications, 41(7), 3276-3283.

Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. Paper presented at the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13.

MS-COCO. (2017). Dataset MS-COCO 2017. Retrieved from https://www.kaggle.com/datasets/awsaf49/coco-2017-dataset?resource=download

Ptucha, R., Such, F. P., Pillai, S., Brockler, F., Singh, V., & Hutkowski, P. (2019). Intelligent character recognition using fully convolutional neural networks. Pattern Recognition, 88, -613.

Saboorian, M. M., Jamzad, M., & Rabiee, H. R. (2010). User adaptive clustering for large image databases. Paper presented at the 2010 20th International Conference on Pattern Recognition.

Song, J., He, T., Gao, L., Xu, X., Hanjalic, A., & Shen, H. T. (2018). Binary generative adversarial networks for image retrieval. Paper presented at the Proceedings of the AAAI Conference on Artificial Intelligence.

Spanier, A. B., Cohen, D., & Joskowicz, L. (2017). A new method for the automatic retrieval of medical cases based on the RadLex ontology. International Journal of Computer Assisted Radiology and Surgery, 12(3), 471-484.

Vijayarajan, V., Dinakaran, M., Tejaswin, P., & Lohani, M. (2016). A generic framework for ontology-based information retrieval and image retrieval in web data. Human-centric Computing and Information Sciences, 6(1), 1-30.

Wang, Z., Liu, X., Li, H., Sheng, L., Yan, J., Wang, X., & Shao, J. (2019). Camp: Cross-modal adaptive message passing for text-image retrieval. Paper presented at the Proceedings of the IEEE/CVF international conference on computer vision.

Yao, B. Z., Yang, X., Lin, L., Lee, M. W., & Zhu, S. C. (2010). I2t: Image parsing to text description. Proceedings of the IEEE, 98(8), 1485-1508.
