Citizenship: Nepal
Permanent Address: Jhalari-7, Kanchanpur, Nepal
Mailing Address: Jeonbuk National University, Jeonju city, South Korea
Date of birth: 25th July, 1988
Phone: (+82)01044049848

Career Objectives

An enthusiastic and adaptive person with a broad and keen interest in discovering innovative information technologies. I particularly enjoy collaborating with tech experts from different disciplines of computer science to develop new skills and solve new challenges.


I am a researcher in Computer Science at Jeonbuk National University, affiliated with the Guru technology research group in Nepal. My current projects apply deep learning methods to animal sound behavior analysis, music information retrieval, music video emotion analysis, music source separation, and object detection, with the aim of devising quantitative measures in affective computing and sound localization. My research interests include multi-level classification, unsupervised learning, self-supervised learning, meta-learning, and incremental learning.
I received a Ph.D. in Computer Engineering from Jeonbuk National University, South Korea. I worked on animal sound localization and classification, affective computing in music video, and music source separation. I proposed five datasets in this domain for training deep neural networks and made them public to support new researchers. My thesis work covers music video affective computing with supervised and unsupervised techniques. I proposed novel datasets, convolution techniques, and unimodal and multimodal architectures using music, video, and facial expressions from music videos.
Prior to embarking on the insanity of doctoral studies, I received an ME in Computer Engineering from Pokhara University, Nepal. In my master's thesis, I worked on data communication and information processing using multiple-antenna technology.
A few years before that, I received a BE in Computer Engineering from Pokhara University, with a thesis investigating a home security system using microcontroller and sensor technology.


Jeonbuk National University

Major in Machine Learning and Deep Neural Networks
CGPA: 4.43/4.50
Sep. 2017 - Feb. 2021

Pokhara University

Nepal College of Information Technology, Balkumari, Lalitpur (PU Affiliated College)
Rank: Dean's List (CGPA: 3.97/4)
Aug. 2010 to May 2013

Pokhara University (PU)

National Academy of Science and Technology, Dhangadhi, Kailali (PU Affiliated College)
CGPA: 3.54/4
Aug. 2005 to Oct. 2010

National Examination Board (NEB)

Radiant Secondary School, Mahendranagar, Kanchanpur (Under NEB) (Class XII)

National Examination Board (NEB)

Shree Radha-Krishna Secondary School, Tiltali, Doti (Under Government of Nepal) (Class X)


Fuzzy Logic and Artificial Intelligence Laboratory at Jeonbuk National University

Machine learning and deep learning-based research
Address: Jeonju City, South Korea
Status: Researcher
27 April 2017 - Ongoing

Ministry of Home Affairs (Government of Nepal)

National Information Collection and Transfer
Address: Singhdurbar, Kathmandu, Nepal
Status: IT officer
27 April 2015 - 22 Aug. 2016

Head of Computer Engineering Department

Assistant Professor and Department Head.
Address: Dhangadhi, Kailali, Nepal
Status: Head of Department
12 Feb 2013 - 27 April 2015

World Vision International Nepal

Database management.
Address: Dhangadhi, Kailali, Nepal
Status: IT officer (Internship)
21 Aug. 2010 - 10 Sep. 2010

Rural Village Water Resources Management Project (RVWRMP)

Database management.
Address: Dhangadhi, Kailali, Nepal
Status: IT officer (Internship)
4 Dec. 2009 - 12 Feb. 2010


Music Video Affective Computing (Unsupervised)

Sep. 2020 - Jan. 2021
  • Article title: Music Video Emotion Classification using Slow-fast Audio-video Network and Unsupervised Feature Representation. [SCIE Journal]
  • Unsupervised and supervised music video emotion classification dataset
  • Autoencoder architecture with audio and video information.
  • Slow-fast audio-video network to capture spatial and temporal information of music and video.
  • Train time information sharing and boosting modules.
  • Under review.

Music Video Affective Computing (Supervised)

Sep. 2020 - Jan. 2021
  • Article title: Deep Learning-Based Multimodal Methods for Emotion Classification in Music Video Contents. [SCIE Journal]
  • Music video emotion classification dataset (improved and extended version)
  • Ablation study on unimodal and multimodal architectures using music, video, and facial expression.
  • Network complexity reduction using novel channel and filter separable convolution.
  • Train time information sharing and boosting modules.
  • End-to-end training, with better results in visual and statistical analysis.
  • Under review.

Facemask States Detection

Nov. 2020 - Jan. 2021
  • Article title: Deep Learning Based Face Mask Status Detection for COVID-19. [SCIE Journal]
  • Semi-automatic visual object labeling tool
  • Facemask detection dataset with three class categories: with mask, without mask, and incorrectly worn mask.
  • Mask detection using Faster-RCNN, Cascade FRCNN, FPN and Cascade FPN.
  • Comparison, visualization, and analysis of system capability and applications.
  • Under review.

Sound Event Labeling Tool

Dec. 2019 - Sep. 2020
  • Article title: A Semi-automatic Sound Annotation Tool for Audio/Video data. [SCIE Journal]
  • Semi-automatic sound event annotation tool using audio and video as input.
  • An automatic event detector is used to detect audio events.
  • Based on the automatic detector's result, a human annotator refines the annotation boundaries.
  • Easy to use, with better audio visualization, Python-based, and output as a simple CSV data file.
  • Diversified annotation tool for any rare sound event.
  • Under review.

Music Source Separation

June 2020 - Nov. 2020
  • Article title: Parallel Stacked Hourglass Network for Music Source Separation. [SCIE Journal]
  • Prepared a Korean traditional song (Pansori) dataset with 3 sources.
  • Korean traditional music Pansori dataset, MIR-1K dataset, and DSD100 dataset used in experiments.
  • Proposed a novel parallel stacked hourglass network (PSHN) with multiple band spectrograms.
  • Ablation study on proposed and past architectures.
  • State-of-the-art results.
  • Published in IEEE Access in Nov. 2020

CNN Based Sound Event Detection in Cowshed

Dec. 2019 - Sep. 2020

Cow Sound Event Localization and Classification

Dec. 2019 - Sep. 2020
  • Article title: Visual Object Detector for Cow Sound Event Detection. [SCIE Journal]
  • Cow sound event detection dataset with 4 class categories.
  • CNN used for sound event detection on the cow sound dataset and the UrbanSound8K dataset.
  • Visual object detection architectures (F-RCNN, CF-RCNN, FPN, C-FPN) used for audio event detection (on log Mel-spectrograms).
  • Compared the proposed CNN and the visual object detection architectures using three test datasets.
  • Published in IEEE Access in Sep. 2020

Music-Video Emotion Classification

Jan. 2019 - Sep. 2019

Music Video Emotion Analysis

Dec. 2018 - March 2019

Domestic Cat Sound Classification

Dec. 2017 - Sep. 2018

Domestic Cat Sound Classification using Transfer Learning

Dec. 2017 - March 2018


Language Skills

  • English Language: Good
  • Korean Language: Moderate (TOPIK-3)
  • Hindi: Very good
  • Nepali: Excellent

Technical Skills

  • Programming Languages: Python, C, C++, PHP
  • Deep Learning Frameworks: TensorFlow, Keras, PyTorch
  • Platforms: Linux, Windows, CUDA/Docker
  • I.D.E Skills: Eclipse, UML, PyCharm


Prof. Joonwhoan Lee

Ph.D. Adviser
Institute: Jeonbuk National University
Ph No.: +82-63-270-2406, +82-010-9855-2406

Prof. Shashidhar Ram Joshi

Master Adviser
Institute: Pokhara University
Ph No.: +977-01-5534070

Mr. SS Mudvari

NAST Engineering College Principal
Institute: National Academy of Science & Technology
Ph No.: +977-91-523312, 521312
Email verify at: