-
DiffIR2VR-Zero: Zero-Shot Video Restoration with Diffusion-based Image Restoration Models
Authors:
Chang-Han Yeh,
Chin-Yang Lin,
Zhixiang Wang,
Chi-Wei Hsiao,
Ting-Hsuan Chen,
Yu-Lun Liu
Abstract:
This paper introduces a method for zero-shot video restoration using pre-trained image restoration diffusion models. Traditional video restoration methods often need retraining for different settings and struggle with limited generalization across various degradation types and datasets. Our approach uses a hierarchical token merging strategy for keyframes and local frames, combined with a hybrid correspondence mechanism that blends optical flow and feature-based nearest neighbor matching (latent merging). We show that our method not only achieves top performance in zero-shot video restoration but also significantly surpasses trained models in generalization across diverse datasets and extreme degradations (8× super-resolution and high-standard deviation video denoising). We present evidence through quantitative metrics and visual comparisons on various challenging datasets. Additionally, our technique works with any 2D restoration diffusion model, offering a versatile and powerful tool for video enhancement tasks without extensive retraining. This research leads to more efficient and widely applicable video restoration technologies, supporting advancements in fields that require high-quality video output. See our project page for video results at https://jimmycv07.github.io/DiffIR2VR_web/.
Submitted 1 July, 2024;
originally announced July 2024.
-
Explainable AI models for predicting liquefaction-induced lateral spreading
Authors:
Cheng-Hsi Hsiao,
Krishna Kumar,
Ellen Rathje
Abstract:
Earthquake-induced liquefaction can cause substantial lateral spreading, posing threats to infrastructure. Machine learning (ML) can improve lateral spreading prediction models by capturing complex soil characteristics and site conditions. However, the "black box" nature of ML models can hinder their adoption in critical decision-making. This study addresses this limitation by using SHapley Additive exPlanations (SHAP) to interpret an eXtreme Gradient Boosting (XGB) model for lateral spreading prediction, trained on data from the 2011 Christchurch Earthquake. SHAP analysis reveals the factors driving the model's predictions, enhancing transparency and allowing for comparison with established engineering knowledge. The results demonstrate that the XGB model successfully identifies the importance of soil characteristics derived from Cone Penetration Test (CPT) data in predicting lateral spreading, validating its alignment with domain understanding. This work highlights the value of explainable machine learning for reliable and informed decision-making in geotechnical engineering and hazard assessment.
Submitted 24 April, 2024;
originally announced April 2024.
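The abstract above interprets an XGB model with SHAP, which builds on Shapley values from cooperative game theory. As a minimal illustration only, the sketch below computes exact Shapley values for a toy value function by averaging each feature's marginal contribution over all feature orderings. The function name `shapley_values`, the feature names, and the toy value function are assumptions made for illustration; this is neither the SHAP library's API nor the authors' pipeline, and practical SHAP implementations use far more efficient estimators.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values by brute force over all feature orderings.

    features: list of hashable feature names (hypothetical, for illustration).
    value: function mapping a frozenset of features to a number.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            # Marginal contribution of f when it joins the current coalition.
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_value(coalition):
    """Toy payoff: 10 only when both 'a' and 'b' are present; 'c' is irrelevant."""
    return 10.0 if {"a", "b"} <= coalition else 0.0


if __name__ == "__main__":
    print(shapley_values(["a", "b", "c"], toy_value))
    # → {'a': 5.0, 'b': 5.0, 'c': 0.0}
```

The brute-force loop costs factorial time in the number of features, so it is only useful for checking intuition on tiny examples.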
-
MENTOR: Multilingual tExt detectioN TOward leaRning by analogy
Authors:
Hsin-Ju Lin,
Tsu-Chun Chung,
Ching-Chun Hsiao,
Pin-Yu Chen,
Wei-Chen Chiu,
Ching-Chun Huang
Abstract:
Text detection is frequently used in vision-based mobile robots when they need to interpret texts in their surroundings to perform a given task. For instance, delivery robots in multilingual cities need to be capable of multilingual text detection so that they can read traffic signs and road markings. Moreover, the target languages change from region to region, implying the need to efficiently re-train the models to recognize novel languages. However, collecting and labeling training data for novel languages is cumbersome, and the effort to re-train an existing text detector is considerable. Even worse, such a routine would repeat whenever a novel language appears. This motivates us to propose a new problem setting for tackling the aforementioned challenges in a more efficient way: "We ask for a generalizable multilingual text detection framework to detect and identify both seen and unseen language regions inside scene images without the requirement of collecting supervised training data for unseen languages as well as model re-training". To this end, we propose "MENTOR", the first work to realize a learning strategy between zero-shot learning and few-shot learning for multilingual scene text detection.
Submitted 11 March, 2024;
originally announced March 2024.
-
Climate Trends of Tropical Cyclone Intensity and Energy Extremes Revealed by Deep Learning
Authors:
Buo-Fu Chen,
Boyo Chen,
Chun-Min Hsiao,
Hsu-Feng Teng,
Cheng-Shang Lee,
Hung-Chi Kuo
Abstract:
Anthropogenic influences have been linked to tropical cyclone (TC) poleward migration, TC extreme precipitation, and an increased proportion of major hurricanes [1, 2, 3, 4]. Understanding past TC trends and variability is critical for projecting future TC impacts on human society considering the changing climate [5]. However, past trends of TC structure/energy remain uncertain due to limited observations; subjectively analyzed and spatiotemporally heterogeneous "best-track" datasets lead to reduced confidence in the assessed TC response to climate change [6, 7]. Here, we use deep learning to reconstruct past "observations" and yield an objective global TC wind profile dataset for 1981 to 2020, facilitating a comprehensive examination of TC structure/energy. By training with uniquely labeled data integrating best tracks and numerical model analysis of 2004 to 2018 TCs, our model converts multichannel satellite imagery to a 0-750-km wind profile of axisymmetric surface winds. The model performance is verified to be sufficient for climate studies by comparing it to independent satellite-radar surface winds. Based on the new homogenized dataset, the major TC proportion has increased by ~13% in the past four decades. Moreover, the proportion of extremely high-energy TCs has increased by ~25%, along with an increasing trend (> one standard deviation of the 40-y variability) of the mean total energy of high-energy TCs. Although the warming ocean favors TC intensification, the TC track migration to higher latitudes and altered environments further affect TC structure/energy. This new deep learning method/dataset reveals novel trends regarding TC structure extremes and may help verify simulations/studies regarding TCs in the changing climate.
Submitted 1 February, 2024;
originally announced February 2024.
-
Distributed Monitoring for Data Distribution Shifts in Edge-ML Fraud Detection
Authors:
Nader Karayanni,
Robert J. Shahla,
Chieh-Lien Hsiao
Abstract:
The digital era has seen a marked increase in financial fraud. Edge ML has emerged as a promising solution for fraud detection in smartphone payment services, enabling the deployment of ML models directly on edge devices. This approach enables more personalized, real-time fraud detection. However, a significant gap in current research is the lack of a robust system for monitoring data distribution shifts in these distributed edge ML applications. Our work bridges this gap by introducing a novel open-source framework designed for continuous monitoring of data distribution shifts on a network of edge devices. Our system includes an innovative calculation of the Kolmogorov-Smirnov (KS) test over a distributed network of edge devices, enabling efficient and accurate monitoring of shifts in user behavior. We comprehensively evaluate the proposed framework employing both real-world and synthetic financial transaction datasets and demonstrate the framework's effectiveness.
Submitted 10 January, 2024;
originally announced January 2024.
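The abstract above relies on the Kolmogorov-Smirnov (KS) test to detect distribution shifts. For orientation, the sketch below computes the ordinary two-sample KS statistic on a single machine: the largest gap between the two empirical CDFs. The function name `ks_statistic` and the sample values are assumptions for illustration; the paper's distributed calculation across edge devices is not reproduced here.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max absolute gap between empirical CDFs."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at data points,
    # so checking every observed value is enough to find the largest gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


if __name__ == "__main__":
    reference = [1, 2, 3, 4]   # hypothetical reference window
    current = [3, 4, 5, 6]     # hypothetical recent window
    print(ks_statistic(reference, current))  # → 0.5
```

A monitoring loop would compare this statistic against a threshold or a p-value; how to pick that threshold is outside the scope of this sketch.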
-
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
Authors:
Chih-Kai Yang,
Kuan-Po Huang,
Ke-Han Lu,
Chun-Yi Kuan,
Chi-Yuan Hsiao,
Hung-yi Lee
Abstract:
This work evaluated several cutting-edge large-scale foundation models based on self-supervision or weak supervision, including SeamlessM4T, SeamlessM4T v2, and Whisper-large-v3, on three code-switched corpora. We found that self-supervised models can achieve performance close to that of the supervised model, indicating the effectiveness of multilingual self-supervised pre-training. We also observed that these models still have room for improvement, as they kept making similar mistakes and performed unsatisfactorily in modeling intra-sentential code-switching. In addition, the validity of several variants of Whisper was explored, and we concluded that they remained effective in a code-switching scenario, and that similar techniques for self-supervised models are worth studying to boost the performance of code-switched tasks.
Submitted 30 December, 2023;
originally announced January 2024.
-
Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech
Authors:
Chien-yu Huang,
Ke-Han Lu,
Shih-Heng Wang,
Chi-Yuan Hsiao,
Chun-Yi Kuan,
Haibin Wu,
Siddhant Arora,
Kai-Wei Chang,
Jiatong Shi,
Yifan Peng,
Roshan Sharma,
Shinji Watanabe,
Bhiksha Ramakrishnan,
Shady Shehata,
Hung-yi Lee
Abstract:
Text language models have shown remarkable zero-shot capability in generalizing to unseen tasks when provided with well-formulated instructions. However, existing studies in speech processing primarily focus on limited or specific tasks. Moreover, the lack of standardized benchmarks hinders a fair comparison across different approaches. Thus, we present Dynamic-SUPERB, a benchmark designed for building universal speech models capable of leveraging instruction tuning to perform multiple tasks in a zero-shot fashion. To achieve comprehensive coverage of diverse speech tasks and harness instruction tuning, we invite the community to collaborate and contribute, facilitating the dynamic growth of the benchmark. To initiate, Dynamic-SUPERB features 55 evaluation instances by combining 33 tasks and 22 datasets. This spans a broad spectrum of dimensions, providing a comprehensive platform for evaluation. Additionally, we propose several approaches to establish benchmark baselines. These include the utilization of speech models, text language models, and the multimodal encoder. Evaluation results indicate that while these baselines perform reasonably on seen tasks, they struggle with unseen ones. We release all materials to the public and welcome researchers to collaborate on the project, advancing technologies in the field together.
Submitted 22 March, 2024; v1 submitted 18 September, 2023;
originally announced September 2023.
-
ExReg: Wide-range Photo Exposure Correction via a Multi-dimensional Regressor with Attention
Authors:
Tzu-Hao Chiang,
Hao-Chien Hsueh,
Ching-Chun Hsiao,
Ching-Chun Huang
Abstract:
Photo exposure correction is widely investigated, but fewer studies focus on correcting under- and over-exposed images simultaneously. Three issues remain open in handling and correcting under- and over-exposed images in a unified way. First, a locally adaptive exposure adjustment may be more flexible than learning a global mapping. Second, determining suitable exposure values locally is an ill-posed problem. Third, photos with the same content but different exposures may not reach consistent adjustment results. To this end, we propose a novel exposure correction network, ExReg, to address these challenges by formulating exposure correction as a multi-dimensional regression process. Given an input image, a compact multi-exposure generation network is introduced to generate images with different exposure conditions for multi-dimensional regression and exposure correction in the next stage. An auxiliary module is designed to predict the region-wise exposure values, guiding the proposed main Encoder-Decoder ANP (Attentive Neural Processes) to regress the final corrected image. The experimental results show that ExReg can generate well-exposed results and outperform the SOTA method by 1.3 dB in PSNR for extensive exposure problems. In addition, given the same image under various exposures for testing, the corrected results are more visually consistent and physically accurate.
Submitted 14 December, 2022;
originally announced December 2022.
-
Specialize and Fuse: Pyramidal Output Representation for Semantic Segmentation
Authors:
Chi-Wei Hsiao,
Cheng Sun,
Hwann-Tzong Chen,
Min Sun
Abstract:
We present a novel pyramidal output representation to ensure parsimony with our "specialize and fuse" process for semantic segmentation. A pyramidal "output" representation consists of coarse-to-fine levels, where each level is "specialized" in a different class distribution (e.g., more stuff than things classes at coarser levels). Two types of pyramidal outputs (i.e., unity and semantic pyramid) are "fused" into the final semantic output, where the unity pyramid indicates unity-cells (i.e., all pixels in such a cell share the same semantic label). The process ensures parsimony by predicting a relatively small number of labels for unity-cells (e.g., a large cell of grass) to build the final semantic output. In addition to the "output" representation, we design a coarse-to-fine contextual module to aggregate the "features" representation from different levels. We validate the effectiveness of each key module in our method through comprehensive ablation studies. Finally, our approach achieves state-of-the-art performance on three widely-used semantic segmentation datasets -- ADE20K, COCO-Stuff, and Pascal-Context.
Submitted 19 August, 2021; v1 submitted 4 August, 2021;
originally announced August 2021.
-
Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
Authors:
Cheng Sun,
Chi-Wei Hsiao,
Ning-Hsu Wang,
Min Sun,
Hwann-Tzong Chen
Abstract:
Indoor panoramas typically consist of human-made structures parallel or perpendicular to gravity. We leverage this phenomenon to approximate the scene in a 360-degree image with (H)orizontal-planes and (V)ertical-planes. To this end, we propose an effective divide-and-conquer strategy that divides pixels based on their plane orientation estimation; then, the succeeding instance segmentation module conquers the task of plane clustering more easily in each plane orientation group. Besides, parameters of V-planes depend on camera yaw rotation, but translation-invariant CNNs are less aware of the yaw change. We thus propose a yaw-invariant V-planar reparameterization for CNNs to learn. We create a benchmark for indoor panorama planar reconstruction by extending existing 360 depth datasets with ground truth H&V-planes (referred to as the PanoH&V dataset) and adopt state-of-the-art planar reconstruction methods to predict H&V-planes as our baselines. Our method outperforms the baselines by a large margin on the proposed dataset.
Submitted 9 September, 2021; v1 submitted 27 June, 2021;
originally announced June 2021.
-
CNN Profiler on Polar Coordinate Images for Tropical Cyclone Structure Analysis
Authors:
Boyo Chen,
Buo-Fu Chen,
Chun-Min Hsiao
Abstract:
Convolutional neural networks (CNN) have achieved great success in analyzing tropical cyclones (TC) with satellite images in several tasks, such as TC intensity estimation. In contrast, TC structure, which is conventionally described by a few parameters estimated subjectively by meteorology specialists, is still hard to profile objectively and routinely. This study applies a CNN to satellite images to create entire TC structure profiles, covering all the structural parameters. By utilizing meteorological domain knowledge to construct TC wind profiles based on historical structure parameters, we provide valuable labels for training in our newly released benchmark dataset. With such a dataset, we hope to attract more attention to this crucial issue among data scientists. Meanwhile, a baseline is established with a specialized convolutional model operating in polar coordinates. We discovered that it is more feasible and physically reasonable to extract structural information in polar coordinates, instead of Cartesian coordinates, given a TC's rotational and spiral nature. Experimental results on the released benchmark dataset verified the robustness of the proposed model and demonstrated the potential for applying deep learning techniques to this barely developed yet important topic.
Submitted 28 October, 2020;
originally announced October 2020.
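The abstract above argues for extracting structure in polar rather than Cartesian coordinates. As a rough illustration of what such a resampling looks like, the sketch below maps a small Cartesian grid onto a (radius, angle) grid around the image center using nearest-neighbor lookup. The function name `to_polar`, the grid sizes, and the nearest-neighbor choice are assumptions for illustration; the authors' actual preprocessing is not described in the abstract and may differ.

```python
import math


def to_polar(image, n_radii, n_angles):
    """Resample a 2D list onto a polar grid centered on the image center.

    Output row i holds samples at radius r_i; output column j at angle theta_j.
    Uses nearest-neighbor lookup (an illustrative choice, not the paper's).
    """
    height, width = len(image), len(image[0])
    center_row = (height - 1) / 2
    center_col = (width - 1) / 2
    max_radius = min(center_row, center_col)
    polar = []
    for i in range(n_radii):
        radius = max_radius * i / max(n_radii - 1, 1)
        ring = []
        for j in range(n_angles):
            theta = 2 * math.pi * j / n_angles
            # Image rows grow downward, so a positive sine moves up the image.
            row = round(center_row - radius * math.sin(theta))
            col = round(center_col + radius * math.cos(theta))
            # Clamp to guard against floating-point overshoot at the border.
            row = min(max(row, 0), height - 1)
            col = min(max(col, 0), width - 1)
            ring.append(image[row][col])
        polar.append(ring)
    return polar


if __name__ == "__main__":
    grid = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
    print(to_polar(grid, n_radii=2, n_angles=4))
    # → [[5, 5, 5, 5], [6, 2, 4, 8]]
```

In the polar grid, a rotation of the input around its center becomes a shift along the angle axis, which is one reason this layout suits rotating systems.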
-
Flat2Layout: Flat Representation for Estimating Layout of General Room Types
Authors:
Chi-Wei Hsiao,
Cheng Sun,
Min Sun,
Hwann-Tzong Chen
Abstract:
This paper proposes a new approach, Flat2Layout, for estimating general indoor room layout from a single-view RGB image, whereas existing methods can only produce layout topologies captured from box-shaped rooms. The proposed flat representation encodes the layout information into row vectors, which are treated as the training target of the deep model. A dynamic-programming-based post-processing step is employed to decode the estimated flat output from the deep model into the final room layout. Flat2Layout achieves state-of-the-art performance on existing room layout benchmarks. This paper also constructs a benchmark for validating the performance on general layout topologies, where Flat2Layout achieves good performance on general room types. Flat2Layout is applicable to more layout estimation scenarios and would have an impact on applications of Scene Modeling, Robotics, and Augmented Reality.
Submitted 29 May, 2019;
originally announced May 2019.
-
HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation
Authors:
Cheng Sun,
Chi-Wei Hsiao,
Min Sun,
Hwann-Tzong Chen
Abstract:
We present a new approach to the problem of estimating the 3D room layout from a single panoramic image. We represent room layout as three 1D vectors that encode, at each image column, the boundary positions of floor-wall and ceiling-wall, and the existence of a wall-wall boundary. The proposed network, HorizonNet, trained for predicting 1D layout, outperforms previous state-of-the-art approaches. The designed post-processing procedure for recovering 3D room layouts from 1D predictions can automatically infer the room shape with low computation cost: it takes less than 20 ms for a panorama image, while prior works might need dozens of seconds. We also propose Pano Stretch Data Augmentation, which can diversify panorama data and be applied to other panorama-related learning tasks. Due to the limited data available for non-cuboid layouts, we relabel 65 general layouts from the current dataset for finetuning. Our approach shows good performance on general layouts by qualitative results and cross-validation.
Submitted 6 April, 2019; v1 submitted 12 January, 2019;
originally announced January 2019.
-
Representation Learning for Image-based Music Recommendation
Authors:
Chih-Chun Hsia,
Kwei-Herng Lai,
Yian Chen,
Chuan-Ju Wang,
Ming-Feng Tsai
Abstract:
Image perception is one of the most direct ways to provide contextual information about a user concerning his/her surrounding environment; hence images are a suitable proxy for contextual recommendation. We propose a novel representation learning framework for image-based music recommendation that bridges the heterogeneity gap between music and image data; the proposed method is a key component for various contextual recommendation tasks. Preliminary experiments show that for an image-to-song retrieval task, the proposed method retrieves relevant or conceptually similar songs for input images.
Submitted 29 August, 2018; v1 submitted 28 August, 2018;
originally announced August 2018.
-
Intelligent decision: towards interpreting the Pe Algorithm
Authors:
Ching-an Hsiao,
Xinchun Tian
Abstract:
Human intelligence lies in the algorithm, the nature of the algorithm lies in classification, and classification is equivalent to outlier detection. Many algorithms have been proposed to detect outliers, along with many definitions of what an outlier is. Unsatisfyingly, these definitions seem vague, which makes each solution an ad hoc one. We analyze the nature of outliers and give two clear definitions. We then develop an efficient RDD algorithm, which converts the outlier problem into a pattern and degree problem. Furthermore, a collapse mechanism is introduced by the IIR algorithm, which can be united seamlessly with the RDD algorithm and serve the final decision. Both algorithms originate from the study of general AI. The combined version is named the Pe algorithm, which is the basis of the intelligent decision. Here we introduce the longest k-turn subsequence problem and its corresponding solution as an example to interpret the function of the Pe algorithm in detecting curve-type outliers. We also give a comparison between the IIR algorithm and the Pe algorithm, which gives a better understanding of both. A short discussion about intelligence is added to demonstrate the function of the Pe algorithm. Related experimental results indicate its robustness.
Submitted 22 August, 2011; v1 submitted 9 June, 2011;
originally announced June 2011.
-
How does certainty enter into the mind?
Authors:
Ching-an Hsiao
Abstract:
Any problem is concerned with the mind, but what do minds make a decision on? Here we show that there are three conditions for the mind to arrive at a certain answer. We found that some difficulties in physics and mathematics are in fact introduced by infinity, which cannot be rightly expressed by minds. Based on this point, we suggest a general observation system, where we use region (a type of infinity) to substitute for infinitesimal (another type of infinity) and thus get an image consistent with the mind. Furthermore, we declare that without world pictures we can never have ideas of any expressive events, which is the primary condition for a wave-function-like mind to collapse to a series of numbers. A subsequent observation by the expanding algorithm brings the final collapse: classifying the numbers and coming up with a certain yes or no answer.
Submitted 23 March, 2010; v1 submitted 9 September, 2009;
originally announced September 2009.
-
On Classification from Outlier View
Authors:
C. A. Hsiao
Abstract:
Classification is the basis of cognition. Unlike other solutions, this study approaches it from the view of outliers. We present an expanding algorithm to detect outliers in univariate datasets, together with the underlying foundation. The expanding algorithm runs in a holistic way, making it a rather robust solution. Synthetic and real data experiments show its power. Furthermore, an application for multi-class problems leads to the introduction of the oscillator algorithm. The corresponding result implies the potential wide use of the expanding algorithm.
Submitted 2 January, 2012; v1 submitted 29 July, 2009;
originally announced July 2009.
-
Enhanced Sensing Characteristics in MEMS-based Formaldehyde Gas Sensor
Authors:
Yu-Hsiang Wang,
C. -C. Hsiao,
Chia-Yen Lee,
R. -H. Ma,
Po-Cheng Chou
Abstract:
This study has successfully demonstrated a novel self-heating formaldehyde gas sensor based on a thin-film NiO sensing layer. A new fabrication process has been developed in which the Pt micro heater and electrodes are deposited directly on the substrate and the NiO thin film is deposited above the micro heater to serve as the sensing layer. Pt electrodes are formed below the sensing layer to measure the electrical conductivity changes caused by formaldehyde oxidation at the oxide surface. Furthermore, the upper sensing layer and NiO/Al2O3 co-sputtering significantly increase the sensitivity of the gas sensor and improve its detection limit. The microfabricated formaldehyde gas sensor presented in this study is suitable not only for industrial process monitoring, but also for the detection of formaldehyde concentrations in buildings in order to safeguard human health.
Submitted 21 February, 2008;
originally announced February 2008.