Colloquium on Statistics in GSES in Osaka University

Sigma Statistics Colloquium

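This seminar series is organized by the statistics research group of the Graduate School of Engineering Science and mainly invites outside speakers. Staff members also give talks from time to time. Anyone interested is welcome to attend.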
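No. 07: November 25, 2013 (Fri), 15:30-16:30. Venue: Mathematical Science Display Room (J617), Building J, School of Engineering Science

Speaker: Hiromichi Nagao (長尾大道), The University of Tokyo
Title: Fundamentals of Data Assimilation and Applications to Solid Earth Science

Abstract: Data assimilation is a methodology for integrating numerical simulations with observational and measurement data within the framework of Bayesian statistics, and it is spreading across a wide range of scientific fields. This seminar first presents the basic theory of data assimilation and then introduces concrete applications in solid earth science.
[Organizer: Kei Hirose (廣瀬慧)]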
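No. 06: March 26, 2013 (Tue), 14:40-16:10. Venue: Mathematical Science Display Room (J617), Building J, School of Engineering Science

Speaker: Takafumi Kanamori (金森敬文), Nagoya University
Title: Semi-Supervised Learning with Density-Ratio Estimation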

Abstract: We study statistical properties of semi-supervised learning, which is considered an important problem in the field of machine learning. In standard supervised learning, only labeled data are observed, and classification and regression problems are formalized as supervised learning. In semi-supervised learning, on the other hand, unlabeled data are obtained in addition to labeled data. Hence, the ability to exploit unlabeled data is important for improving prediction accuracy in semi-supervised learning. This problem is regarded as a semiparametric estimation problem with missing data. Under discriminative probabilistic models, it was considered that unlabeled data are useless for improving estimation accuracy. Recently, however, a weighted estimator using the unlabeled data has been shown to achieve better prediction accuracy than learning methods that use only labeled data, especially when the discriminative probabilistic model is misspecified. That is, improvement under the semiparametric model with missing data is possible when the semiparametric model is misspecified. In this paper, we apply the density-ratio estimator to obtain the weight function in semi-supervised learning. Our approach is advantageous because the proposed estimator does not require well-specified probabilistic models for the probability of the unlabeled data. Based on statistical asymptotic theory, we prove that the estimation accuracy of our method outperforms supervised learning using only the labeled data. Numerical experiments demonstrate the usefulness of our methods.
[Organizer: Hidetoshi Shimodaira (下平英寿)]
No. 05: March 11, 2013 (Mon), 14:40-16:10. Venue: Mathematical Science Display Room (J617), Building J, School of Engineering Science

Speaker: Ruriko Yoshida (University of Kentucky)
Title: Optimality of the Neighbor Joining Algorithm and Faces of the Balanced Minimum Evolution Polytope

Abstract: Balanced minimum evolution (BME) is a statistically consistent distance-based method to reconstruct a phylogenetic tree from an alignment of molecular data. In 2000, Pauplin showed that the BME method is equivalent to optimizing a linear functional over the BME polytope, the convex hull of the BME vectors obtained from Pauplin's formula applied to all binary trees. The BME method is related to the Neighbor Joining (NJ) algorithm, now known to be a greedy optimization of the BME principle. Further, the NJ and BME algorithms have been studied previously to understand when the NJ algorithm returns a BME tree for small numbers of taxa. In this talk we aim to elucidate the structure of the BME polytope and strengthen knowledge of the connection between the BME method and the NJ algorithm. We first prove that any subtree-prune-regraft move from a binary tree to another binary tree corresponds to an edge of the BME polytope. Moreover, we describe an entire family of faces parametrized by disjoint clades. We show that these clade-faces are themselves smaller-dimensional BME polytopes. Finally, we show that for any order of joining nodes to form a tree, there exists an associated distance matrix (i.e., dissimilarity map) for which the NJ algorithm returns the BME tree. More strongly, we show that the BME cone and every NJ cone associated to a tree T have an intersection of positive measure.
[Organizer: Hidetoshi Shimodaira (下平英寿)]
No. 04: September 21, 2012 (Fri), 15:20-17:30. Venue: Mathematical Science Large Seminar Room (J706), Building J, School of Engineering Science

Speaker: Christophe Ambroise (Laboratoire Statistique et Genome, CNRS)
Title: Inferring Sparse Gaussian Graphical Models for Biological Network

Abstract: Gaussian graphical models provide a convenient framework for representing dependencies between variables. In this framework, a set of variables is represented by an undirected graph, where vertices correspond to variables, and an edge connects two vertices if the corresponding pair of variables is dependent, conditional on the remaining ones. Recently, this tool has attracted considerable interest for the discovery of biological networks by l1-penalization of the model likelihood. In this presentation, we introduce various ways of inferring sparse co-expression networks based on partial correlation coefficients from either steady-state or time-course transcriptomic data. All proposals search for a latent structure of the network to drive the selection of edges through an adaptive l1-penalization of the model likelihood. We focus on inference from samples collected in different experimental conditions and therefore not identically distributed.
[Organizer: Hidetoshi Shimodaira (下平英寿)]

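Speaker: Hidetoshi Matsui (松井秀俊), Faculty of Mathematics, Kyushu University
Title: Statistical Modeling and Variable Selection Based on Functional Data

Abstract: We report on regression and classification methods based on functional data. When data for a single individual are observed repeatedly over time or across positions, the approach that treats them as a function of time or position is called functional data analysis (FDA). In this work, we introduce functional regression modeling, which models the relationship between predictors given as functional data and a scalar response. In particular, when multiple predictors are given, we describe a method that performs model estimation and variable selection simultaneously based on sparse regularization. We also describe a method that applies sparse regularization to a logistic regression model for functional data in order to select the variables that affect classification. We apply the classification method to longitudinally observed gene expression data from multiple sclerosis patients and attempt gene selection.
[Organizer: Kei Hirose (廣瀬慧)]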
No. 03: September 25, 2012 (Tue), 16:00-17:30. Venue: Mathematical Science Large Seminar Room (J706), Building J, School of Engineering Science

Speaker: Christophe Ambroise (Laboratoire Statistique et Genome, CNRS)
Title: New consistent and asymptotically normal parameter estimates for random-graph mixture models

Abstract: Random-graph mixture models are very popular for modelling real data networks. Parameter estimation procedures usually rely on variational approximations, either combined with the expectation-maximization (EM) algorithm or with Bayesian approaches. Despite good results on synthetic data, the validity of the variational approximation is, however, not established. Moreover, these variational approaches aim at approximating the maximum likelihood or the maximum a posteriori estimators, whose behaviour in an asymptotic framework (as the sample size increases to infinity) remains unknown for these models. In this work, we show that, in many different affiliation contexts (for binary or weighted graphs), parameter estimators based either on moment equations or on the maximization of some composite likelihood are strongly consistent and √n-convergent when the number n of nodes increases to infinity. As a consequence, our result establishes that the overall structure of an affiliation model can be (asymptotically) caught by the description of the network in terms of its number of triads (order 3 structures) and edges (order 2 structures). Moreover, these parameter estimates are either explicit (as for the moment estimators) or may be approximated by using a simple EM algorithm, whose convergence properties are known. We illustrate the efficiency of our method on simulated data and compare its performance with other existing procedures. A data set of cross-citations among economics journals is also analysed.
[Organizer: Hidetoshi Shimodaira (下平英寿)]
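No. 02: July 23, 2012 (Mon), 16:30-18:30. Venue: Seminar Room, International Building, School of Engineering Science

Speaker: Rinya Takahashi (高橋倫也), Faculty of Maritime Sciences, Kobe University
Title: Introduction to Extreme Value Theory
Speaker: Professor Laurens de Haan (Faculty of Economics, Erasmus University Rotterdam)
Title: Estimation of the marginal expected shortfall
[Organizer: Yasutaka Shimizu (清水泰隆)]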
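No. 01: April 9, 2012 (Mon), 16:30-18:00. Venue: Seminar Room, International Building, School of Engineering Science

Speaker: Hidetoshi Shimodaira (下平英寿), Statistical Science Division, Graduate School of Engineering Science
Title: Statistical Science and Parallel Computing for Revealing Network Structure in the Web and the Life Sciences
Speaker: Yasutaka Shimizu (清水泰隆), Mathematical and Statistical Finance Division, Graduate School of Engineering Science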
Title: Taking On Actuarial Mathematics: 高志乃水次郎長 Goes to Canada
[Organizers: Yutaka Kano (狩野裕), Etsuo Kumagai (熊谷悦生)]