Wikipedia mining of hidden links between political leaders
Frahm, Jaffrès-Runser, Shepelyansky, Eur. Phys. J. B (2016) 89: 269
2013 Wikipedia edition

Rankings of World Universities

About 20 different global university rankings are listed in the Wikipedia page "College and university rankings".
These rankings have an impact on the scientific and educational policies of governments.
Is there a universal ranking without a priori criteria and without cultural bias?
All these rankings are composite: several criteria are combined (α + ...) into a composite score. Also, universities are preselected.
So let's shake Wikipedia!

Wikipedia Ranking of World Universities
Conclusion

WRWU is free from cultural preferences since:
- it takes into account many cultural points of view, as we use the human knowledge contained in 24 Wikipedia language editions (17 million Wiki articles)
- these cultural points of view are treated on an equal footing, with the same statistical analysis (PageRank, CheiRank, 2DRank)
WRWU measures academic excellence (its top 10 and top 100 are similar to those of ARWU) but also the historical, social, or regional importance of universities.
WRWU can be considered complementary to existing rankings such as ARWU, but in fact it already encodes all existing rankings, since Wikipedia contains information on them.
Universal ranking?

Wikipedia Ranking of World Universities

José Lages - Antoine Patt
Institut UTINAM, CNRS, Université de Franche-Comté
Dima Shepelyansky
Laboratoire de Physique Théorique, CNRS, Université de Toulouse
Reference: J.L., A.P., D.S., The European Physical Journal B (2016) 89: 69

(Most of) human knowledge is encoded in Wikipedia.
Everybody uses it, at least as a first approach (first contact with a subject).
About 40M wikipages, 280 language editions.

Wikipedia Ranking of World Universities

From the PageRank, CheiRank, and 2DRank rankings of each of the 24 Wikipedia language editions, we extract the rank index of the pages devoted to universities, and we establish a top 100 for each edition and for each algorithm.
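The extraction step described above can be sketched as follows. This is only a minimal illustration, not the authors' code: the function name `top_universities`, the toy page titles, and the assumption that university pages are given as a ready-made set of titles are all hypothetical. The rank index K of a page is taken to be its 1-based position in the full ranking of one edition.

```python
def top_universities(ranked_titles, university_titles, top_n=100):
    """Return [(K, title)] for the first `top_n` university pages.

    ranked_titles: all page titles of one edition, best-ranked first, so the
        rank index K of a page is its 1-based position in this list.
    university_titles: set of titles identified as university pages
        (how this set is obtained is an assumption left open here).
    """
    selected = []
    for k, title in enumerate(ranked_titles, start=1):
        if len(selected) >= top_n:
            break
        if title in university_titles:
            selected.append((k, title))
    return selected


# Toy data: hypothetical titles, not taken from any Wikipedia edition.
ranked = ["Country A", "City B", "University X", "Year 2000",
          "University Y", "River C", "University Z"]
universities = {"University X", "University Y", "University Z"}

top2 = top_universities(ranked, universities, top_n=2)
print(top2)
```

Running the same function once per edition and per algorithm would give the per-edition top-100 lists that the ranking is built from.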
Example: the PageRank algorithm applied to frwiki (March '13) gives the following top 3:
1- K=904 "Université de Harvard"
2- K=1549 "École Polytechnique"
3- K=1558 "Université d'Oxford"

Wikipedia PageRanking of World Universities (WRWU)
 1st University of Cambridge
 2nd University of Oxford
 3rd Harvard University
 4th Columbia University
 5th Princeton University
 6th MIT
 7th University of Chicago
 8th Stanford University
 9th Yale University
10th University of California, Berkeley

Academic Ranking of World Universities (ARWU, "Shanghai ranking" 2013)
(in parentheses: ARWU rank minus WRWU rank)
 1st Harvard University (-2)
 2nd Stanford University (-6)
 3rd University of California, Berkeley (-7)
 4th MIT (-2)
 5th University of Cambridge (+4)
 6th California Institute of Technology (-22)
 7th Princeton University (+2)
 8th Columbia University (+4)
 9th University of Chicago (+2)
10th University of Oxford (+8)

90% overlap between the WRWU and ARWU top 10s
60% overlap between the WRWU and ARWU top 100s
Definitely, like ARWU, WRWU measures academic excellence, but not only ...

Oxbridge at the top of WRWU, followed by major US universities
Marketwatch, December 10, '15

[Figure: overlap with ARWU for WRWU, enwiki, frwiki, and dewiki]

Results: 1024 universities ranked with the PageRank algorithm, 1378 with the CheiRank algorithm, and 1559 with the 2DRank algorithm.
Universities from 142 countries.

[Figure: WCRWU index versus WPRWU index, with regions labeled "more communicative" and "less communicative"]
Approximate balance between the influence and communicative properties of universities in Wikipedia.

Geographical distribution of universities in WRWU
[Figure: world map with a color scale from 1 to 118]

As in other rankings such as ARWU, US universities still dominate, BUT ...

[Figure: Wikipedia PageRank top 100 universities (WPRWU); number of universities per foundation country; countries shown: US, DE, UK, FR, JP, SE, CH, IT, NL, PL, CA, CN, RU, AT, CZ, DK, EE, EG, FI, IE, IL, NO, PT]
[Figure: Shanghai top 100 universities (ARWU); number of universities per foundation country; countries shown: US, UK, AU, CA, CH, DE, FR, IL, JP, NL, SE, DK, BE, FI, NO, RU]
...
... fewer US universities and more European universities among the top 100, ...
... and also more old universities among the top 100.

[Figure: number of universities per foundation century (11th to 20th), ARWU top 100 versus WRWU top 100]

"Newcomers" in top 100

Rank  Century  University
  25  11th     University of Bologna
  97  13th     University of Coimbra
  69  13th     University of Padua
  33  14th     Charles University in Prague
  65  14th     Jagiellonian University
  51  14th     Sapienza University of Rome
  21  14th     University of Vienna
  26  15th     Leipzig University
  59  15th     University of Glasgow
  92  15th     University of St Andrews
  64  15th     University of Tübingen
  90  16th     Martin Luther University of Halle-Wittenberg
  80  16th     Trinity College, Dublin
  75  16th     University of Jena
  32  17th     Lund University
  93  17th     University of Amsterdam
  83  17th     University of Tartu
  38  18th     École Polytechnique
  56  18th     Georgetown University
  66  18th     Saint Petersburg State University
  22  18th     University of Göttingen
  99  18th     University of Wrocław
  11  19th     Humboldt University of Berlin
  98  19th     Indiana University
  76  19th     Keio University
  23  19th     London School of Economics
  61  19th     Peking University
  39  19th     University of Bonn
  95  19th     University of Notre Dame
  45  19th     University of Virginia
  86  19th     University of Warsaw
  71  19th     Waseda University
  43  20th     Al-Azhar University
  67  20th     Free University of Berlin
  85  20th     Institut Polytechnique des Sciences Avancées
  96  20th     Technical University of Berlin
  91  20th     Tsinghua University
 100  20th     University of Hamburg

These universities are important for their historical, social, or regional impact.

Evolution through centuries
[Figure labels: First universities; Emergence of US universities; Dominance of European universities; Dominance of US universities; Emergence of Asian universities]
What's next? Dominance of Asian universities?
A quite rigid club of first universities, "not willing" to accept new members:
Universities in the top 5 were founded before the 19th century.
Universities in the top 43 were founded before the 20th century.

[Figure: distribution over foundation century for the PageRank top 100s of the Wikipedia editions]
[Figure: distribution over foundation century for the CheiRank top 100s of the Wikipedia editions]

Network of cultures
[Figure: two networks, one from PageRank and one from CheiRank]
An arrow points from a culture A to a culture B; the width of the arrow is proportional to the number of universities of culture B appearing in the ranking of culture A.

[Figure: cultures in the (K,K*) plane]

24 Wikipedia language editions, covering 59% of the world population and 68% of all Wikipages

Edition  Language     N
EN       English      4212493
DE       German       1532978
FR       French       1352825
NL       Dutch        1144615
IT       Italian      1017953
ES       Spanish       974025
RU       Russian       966284
PL       Polish        949153
JA       Japanese      852087
SV       Swedish       780872
PT       Portuguese    758227
ZH       Chinese       663485
VI       Vietnamese    594089
FA       Persian       295696
HU       Hungarian     235212
KO       Korean        231959
TR       Turkish       206311
AR       Arabic        203328
MS       Malaysian     180886
DA       Danish        175228
HE       Hebrew        144959
HI       Hindi          96869
EL       Greek          82563
TH       Thai           78953

About 17M wikipages considered (March '13).
Each Wikipedia edition is treated as a complex network.

Applications of the (reduced) Google matrix method to complex network analysis

- Google matrix method: the example of the Wikipedia Ranking of World Universities
- Reduced Google matrix method: the example of a protein-protein interaction network

José Lages
Institut UTINAM / OSU THETA / CNRS / Université de Bourgogne-Franche-Comté

Collaborators:
Dima Shepelyansky (Laboratoire de Physique Théorique de Toulouse / UPS / CNRS)
Klaus Frahm (Laboratoire de Physique Théorique de Toulouse / UPS / CNRS)
Andrei Zinovyev (Institut Curie / PSL / Inserm)
Célestin Coquidé (Institut UTINAM / UBFC / CNRS)
Guillaume Rollin (Institut UTINAM / UBFC / CNRS)
Antoine Patt (UBFC)

Project APEX "kick-off" meeting, October 20th 2017

How Google works
From Markov (1906) to Brin & Page (1998)

Markovian process: a random surfer probes the structure of a directed network.
At each step, the surfer randomly chooses an adjacent node to hop to and continues its journey.
[Equations lost in extraction: adjacency matrix, stochastic matrix, Google matrix (Perron-Frobenius operator)]
The PageRank algorithm is at the heart of the search engine (Brin & Page '98).

Google Matrix analysis

[Equations lost in extraction: adjacency matrix; outdegree and indegree of node j]

For a given wikiedition, we rank all the pages using:
- PageRank algorithm: the more a page is pointed to by important pages, the more important it is (measure of influence). [Equations lost in extraction: stochastic matrix, Google matrix]
- CheiRank algorithm: the more a page points to important pages, the more important it is (measure of communicativity). [Equations lost in extraction: stochastic matrix, Google matrix] (Chepelianskii '10, Ermann et al. '12, Zhirov et al. '10)
- 2DRank algorithm: measure of both influence and communicativity (Zhirov et al. '10)

Reduced Google matrix

Consider a network with N nodes, and a sub-network (a community) of nodes within it.
The Google matrix of the size-N network and the associated PageRank vector can be written: [equation lost in extraction]
We define the reduced Google matrix associated with the community such that: [equation lost in extraction]
The reduced Google matrix can be written as the sum of: [equation lost in extraction]
- a contribution from direct links
- a contribution from indirect links (scattering term)
Convergence is very slow, since an eigenvalue of [symbol lost in extraction] is very close to 1.
The reduced Google matrix is therefore decomposed into: [equation lost in extraction]
- a contribution from « PageRank »
- a contribution from direct links
- a contribution from hidden links

[Figures: contributions from direct links and from hidden links, and circles of influence, for political leaders; Wikipedia mining of hidden links between political leaders, Frahm, Jaffrès-Runser, Shepelyansky, Eur. Phys. J. B (2016) 89: 269; 2013 Wikipedia edition]

Reduced Google Matrix in Directed Biological Networks

José Lages
Institut UTINAM, CNRS, Université de Franche-Comté
Dima Shepelyansky
Laboratoire de Physique Théorique, CNRS, Université de Toulouse
Andrei Zinovyev
Institut Curie, Inserm, PSL
Reference: J.L., D.S., A.Z., bioRxiv [identifier not recovered], submitted to PLOS ONE
Complete ranking at: [URL not recovered]

Genes of a proliferative signature resulting from pan-cancer transcriptomic analysis
[Figure: contributions from direct links and from hidden links]
- More genes are connected into the network
- Emergence of a new "hidden" hub, BUB1
- Connection to PCNA (DNA replication and DNA repair)
- Many cell cycle proteins improve in PageRank (AURK)
- Connection between STIL (mitotic spindle checkpoint regulator) and CCNA2, CCNE1

Comparing two TRN networks, e.g., "normal" vs "cancer"
(results presented at the 13th [BC]2 Basel Computational Biology Conference, 2017)
[Figure: direct, indirect, and "hidden" links in TRN1 ("normal", normal B-lymphocytes) and TRN2 ("cancer", leukemia cell line); nodes B, E, G; labels "PageRank goes down", "PageRank improves", "indirect signaling rewiring"]

Inferring indirect (hidden) causal connections between AKT-mTOR pathway members
[Figure labels: PRKA (AMP sensor) signalling to AKT1; hidden apoptosis pathway]

Thank you for your attention

Big Data seen as directed networks
Non-exhaustive list of applications

Data                                      Nodes              Links between nodes
WWW                                       Web pages          Hyperlinks
Wikipedia                                 Wiki articles      Intra-wiki citations
Twitter / social networks                 Members            Follow relations
World Trade (from WTO, UN, OECD, ...)     Goods x countries  Economic balance between countries
Omics                                     Proteins           Inhibition/activation
Linux                                     Kernel commands    Command succession
DNA                                       Patterns           Pattern successions
Brain activity                            Neurons            Synaptic connections
Go game                                   Played patterns    Pattern successions

[Figure: Google matrices G and G* for enwiki Aug. '09, Cambridge '06, and Oxford '06]

Real directed complex networks

Properties of real networks (WWW, social networks, ...):
- "Small world" property: average distance between two nodes [expression lost in extraction]
- Scale-free property: distribution of the number of ingoing or outgoing links [expression lost in extraction]

[Figures: Google matrix G of the English Wikipedia network (Aug. '09), N = 3 282 257; Google matrix spectrum; PageRank distribution properties]
Figures from: Ermann, Frahm, Shepelyansky (2016), Scholarpedia, 11(11):30944

Theory of real directed complex networks

Random Matrix Theory, introduced by Wigner '67, describes universal spectral properties shared by complex nuclei, atoms, and molecules, and also by mesoscopic and quantum chaos systems (statistical ensembles of Hermitian and unitary matrices).
Challenge: a Random Matrix Theory for Markov chains and Google matrix ensembles is still lacking.
We need more examples and more applications.

[Figure: Wikipedia English '13 network; taken from https :// (URL truncated in source)]
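The Google matrix, PageRank, CheiRank, and reduced Google matrix slides lost their equations in extraction, so the construction can be sketched in code instead. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the standard textbook definitions G = αS + (1 − α)/N and G_R = G_rr + G_rs (1 − G_ss)^(-1) G_sr from the cited literature, a damping factor α = 0.85 (a common choice, assumed here), a hypothetical 5-node toy network, and a direct linear solve in place of the slowly converging series mentioned on the slides. 2DRank is not covered.

```python
import numpy as np


def google_matrix(adj, alpha=0.85):
    """G = alpha * S + (1 - alpha) / N, with adj[i, j] = 1 if node j links to node i."""
    n = adj.shape[0]
    out_degree = adj.sum(axis=0)              # column sum = out-degree of node j
    dangling = out_degree == 0
    s = np.empty((n, n))
    s[:, ~dangling] = adj[:, ~dangling] / out_degree[~dangling]
    s[:, dangling] = 1.0 / n                  # dangling nodes jump uniformly
    return alpha * s + (1.0 - alpha) / n


def pagerank(g, tol=1e-12, max_iter=10_000):
    """Stationary vector of G (G p = p) by power iteration."""
    p = np.full(g.shape[0], 1.0 / g.shape[0])
    for _ in range(max_iter):
        p_next = g @ p
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


def reduced_google_matrix(g, reduced):
    """G_R = G_rr + G_rs (1 - G_ss)^(-1) G_sr for the node subset `reduced`."""
    rest = np.setdiff1d(np.arange(g.shape[0]), reduced)
    g_rr = g[np.ix_(reduced, reduced)]
    g_rs = g[np.ix_(reduced, rest)]
    g_sr = g[np.ix_(rest, reduced)]
    g_ss = g[np.ix_(rest, rest)]
    # Direct solve; only practical for small networks.
    return g_rr + g_rs @ np.linalg.solve(np.eye(len(rest)) - g_ss, g_sr)


# Hypothetical toy network: (source, target) pairs.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2), (3, 4), (4, 0)]
adj = np.zeros((5, 5))
for src, dst in edges:
    adj[dst, src] = 1.0

G = google_matrix(adj)
p = pagerank(G)                               # PageRank: influence
p_star = pagerank(google_matrix(adj.T))       # CheiRank: same method, links inverted
K = np.argsort(-p)                            # nodes ordered by decreasing PageRank

reduced = np.array([0, 2, 4])
G_R = reduced_google_matrix(G, reduced)
print(K, G_R.round(3))
```

With these definitions, G_R is again column-stochastic, and the PageRank vector restricted to the selected nodes is its eigenvector with eigenvalue 1, which gives a quick consistency check on any implementation.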
