Honoring the Lion: A Festschrift for Jan de Leeuw

This special volume celebrates the 20th anniversary of the Journal of Statistical Software (JSS) and is a Festschrift for its founding editor Jan de Leeuw. Jan recently retired from his long-held position as founding chair of the Department of Statistics at the University of California, Los Angeles. The contributions to this special volume look back at some of his research interests and accomplishments during the half-century that he has been active in psychometrics and statistics. In this introduction, the guest editors also reminisce on their own first encounters with Jan, ten years ago. Since that time JSS has solidified its place as a leading journal of computational statistics, a fact that has a lot to do with Jan’s stewardship. We include a brief history of JSS.


Introduction
This special volume has a dual purpose: it celebrates the first twenty years of the Journal of Statistical Software (JSS), and it acts as a Festschrift for Jan de Leeuw, who recently retired from his long-held position as founding chair of the Department of Statistics at the University of California, Los Angeles (UCLA). Since Jan also founded JSS, these celebrations have something to do with each other.
In Section 2, the special volume guest editors recount some personal recollections of Jan. In Section 3 the contributions to the special volume are discussed, along with some of the context of Jan's long career. We sketch a brief history of JSS in Section 4. Section 5 contains a few concluding remarks, and Appendix A presents the fortunesJdL package of collected Jan-isms prepared as an accompaniment to this introduction.

Some personal reflections
One of the benefits of editing a Festschrift is that the editors can selfishly share their personal stories with the readers, not subject to any further proof of truth. I (PM) met Jan de Leeuw at the useR! conference in Vienna in 2006. I was introduced to Jan by John Fox. Jan had plans to edit a JSS special volume on Psychometrics and John threw me into it by saying: "Jan, this is Patrick. He's working on a Rasch model package. Maybe something relevant for the special volume." Jan replied: "Sure, just submit." So much for that. The pressure was on since we did not have the parameter estimation fully working yet. For some forgotten, obscure reason I even ended up co-editing the special volume (De Leeuw and Mair 2007). In the introduction, Jan eloquently summarized my contribution: "Guest editor Patrick Mair reviewed all R code, including his own." Back to Vienna, 2006. The day after our succinct special volume conversation, Jan gave a presentation on his PsychoR project and laid out some ideas for corresponding R implementations. My boss at that time, Kurt Hornik, asked me: "Did you understand what he was talking about?" I mumbled something in the range between "somewhat" and "not really". Kurt likes this type of answer and replied with a crisp: "I thought you were a psychometrician". A couple weeks later, upon passing Kurt's office, he called to me: "You should go to LA and work with Jan on PsychoR. The department can't give you any money, Jan doesn't have money either." A few months later I was on the plane to LA, poor yet heavily armed with the Gifi book.
Jan took me to the faculty center for lunch right away. The first question he asked: "Are you a psychologist?" I replied "No, a statistician". I think this was the "right" answer: I sensed some kind of relief under his mustache. Still, the conversation did not particularly flow after that. I was too shy and Jan is not really the king of small talk.
However, from then on I was granted a weekly Tuesday audience when Jan came down from his ranch in the prairies of Cuddy Valley where he lived with his dogs. His answer to any question regularly consisted of a short matrix algebra dictate. He typically looked at some reference point close to the ceiling before taking off with "of course, K is a polyhedral convex cone of monotone transformations in $\mathbb{R}^n$ ...". I took notes and tried to figure things out on my own (and often came back with the same stupid questions one week later). I also figured that his favorite word collocation has to be "of course".
Overall, working with Jan has been a tremendously productive and, for me at least, inspiring collaboration which continues to this day. I would like to express my gratitude to Jan for introducing me to the fascinating world of scaling methods, as well as for his tireless support, kindness, and patience with my lack of understanding (still to this day), and for being such a character.
Jan wrote on the R-help mailing list in 2005 that, "Bad English is the language of science." This may be true, but at the useR! 2006 conference in Vienna, I (KM) was interested in the practice of bad Dutch. I was learning the language at the Vrije Universiteit Amsterdam, and eager to practice at every opportunity. I spotted Jan's nametag with joy. This guy was obviously Dutch; a perfect victim. No matter that he was at UCLA. I introduced myself and in short order learned that he was, in fact, Dutch, but had long ago traded the bikepaths of the Netherlands for the freeways of Southern California. He showed me photos of the dry, grassy highlands that he called home: Cuddy Valley. It could not get more different from his sodden motherland.
It was a pretty small conference and I kept running into Jan. This was inevitable: we were both glued to our laptops and therefore competing for the same power outlets and wireless signal. I was trying to produce a revised Windows version of my package TIMP with some urgency for my colleagues in the Netherlands; this was in the days before Uwe Ligges' Win-Builder service, when building Windows packages under Linux was more involved. Jan showed me his Apple laptop on which he virtualized Solaris, Windows, and some Linux variant. It was all very slick. I was impressed but slightly uneasy (what would Stallman think?). Jan is a pragmatic guy.
Despite being preoccupied with viewing all the World Cup games, Jan generously found some time to talk. We must have discussed my research, because I left the conference with an invitation to guest-edit a special volume in JSS' Foo-metrics in R series, on Spectroscopy and Chemometrics in R. He presented the plans for the special volumes in his keynote talk on Psychometrics in R, saying that the volumes would be in "psychometrics, econometrics, sociometrics and whatever else anyone suggests along these lines. Of course there is an inherent risk in actually making constructive suggestions – you may wind up to be a guest editor." So Jan did not set the bar for being a guest editor particularly high. He somewhat casually suggested that we should plan on publishing in six months. I was young and inexperienced in the realm of academic procrastination and moving deadlines, and took his words to mean that we would publish in six months; I badgered the contributors to the special volume to write, review, and revise their papers "on time". Achim Zeileis contributed very quick technical editing. The special volume was born on schedule in January 2007. That turnaround now seems miraculous! That was just the start of my involvement with JSS and with Jan. In the years since I have also traded Dutch bikepaths for Southern California freeways, though now Jan has left his Cuddy Valley for the wet environs of Portland. Still, Jan seems always reachable via the Internet, just like JSS. I remain as inspired by his example of scholarship as I was in 2006. Heel hartelijk dank voor alles, Jan!

Outline of contributions
First of all we would like to thank all the Festschrift contributors for being part of this adventure. Instead of trying to agree on a very sophisticated, "flowing" order of the articles, the guest editors rely on artificial intelligence and let scaling techniques do the job.
We start by extracting the full plain text from the Festschrift contributions and play the usual text mining game (store the text as a corpus and perform some basic cleanup: removing punctuation and stopwords, converting to lowercase, etc.), using the tm package (Feinerer, Hornik, and Meyer 2008). The resulting corpus is provided in the supplementary materials. Figure 2 shows a word cloud produced with the wordcloud package (Fellows 2014), including terms with frequencies ≥ 10 only.
Based on the corpus we compute a document-term matrix (DTM) with the single contributions in the rows (labeled using the first author's name) and the terms in the columns.

R> dtm <- DocumentTermMatrix(jancorp)
The DTM is of dimension 8 × 4114. We reduce the number of terms by computing the term frequency-inverse document frequency (tf-idf) and keeping the 50% most important terms only.
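The tf-idf filtering step can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code used for the Festschrift: the toy matrix `dtm_toy`, its counts, the per-term score (mean tf-idf over the documents containing the term), and the median cutoff are all hypothetical choices made here to show the idea of keeping roughly the top half of the terms.

```r
# Toy document-term matrix: 3 documents (rows) x 6 terms (columns).
# The counts are invented for illustration only.
dtm_toy <- matrix(
  c(4, 0, 1, 2, 0, 3,
    0, 5, 1, 2, 0, 0,
    0, 0, 1, 2, 6, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"),
                  c("scaling", "growth", "data", "model", "cone", "stress"))
)

# Term frequency: counts normalized by document length.
tf <- dtm_toy / rowSums(dtm_toy)

# Inverse document frequency: log2(number of docs / docs containing the term).
docs_with_term <- colSums(dtm_toy > 0)
idf <- log2(nrow(dtm_toy) / docs_with_term)

# tf-idf per cell, then one score per term: the mean tf-idf over the
# documents in which the term occurs (an assumed scoring rule).
tfidf <- sweep(tf, 2, idf, `*`)
term_score <- colSums(tfidf) / docs_with_term

# Keep the terms scoring at or above the median, i.e., roughly the top 50%.
keep <- term_score >= median(term_score)
dtm_reduced <- dtm_toy[, keep, drop = FALSE]
print(colnames(dtm_reduced))
```

Terms that occur in every document ("data" and "model" in the toy matrix) receive an idf of zero and are dropped, which is the intended effect: they do not help to tell the contributions apart.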
Subsequently we compute the cosine dissimilarities among the contributions, which gives us a symmetric 8 × 8 dissimilarity matrix that we subject to unidimensional scaling. Here we use a full permutation algorithm as described in Mair and De Leeuw (2015) and implemented in smacof (De Leeuw and Mair 2009a).
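To make the two steps concrete, the following base R sketch computes cosine dissimilarities for a small invented matrix and then finds a unidimensional scaling solution by enumerating all orderings. It is an illustration only and not the smacof implementation: the toy data `X` and the helpers `all_perms()` and `stress()` are hypothetical. For each ordering, the candidate coordinates are the row sums of the sign-weighted dissimilarities divided by n, and the candidate with the smallest raw stress is retained.

```r
# Toy document-term matrix (invented counts): 4 documents x 5 terms.
X <- matrix(
  c(5, 1, 0, 0, 2,
    4, 2, 1, 0, 1,
    0, 1, 4, 3, 0,
    0, 0, 3, 5, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), NULL)
)

# Cosine dissimilarity: 1 minus the cosine of the angle between row vectors.
norms <- sqrt(rowSums(X^2))
delta <- 1 - (X %*% t(X)) / (norms %o% norms)
diag(delta) <- 0  # guard against rounding error on the diagonal

# All permutations of a vector, one per row (feasible only for small n).
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], all_perms(v[-i]))))
}

# Raw stress of a one-dimensional configuration x.
stress <- function(x) {
  d <- abs(outer(x, x, "-"))
  sum((delta - d)[upper.tri(delta)]^2)
}

n <- nrow(delta)
perms <- all_perms(seq_len(n))
best <- list(stress = Inf, x = NULL)
for (r in seq_len(nrow(perms))) {
  pos <- perms[r, ]                 # pos[i] = position of document i on the line
  s <- sign(outer(pos, pos, "-"))   # s[i, j] = +1 if i is placed after j
  x <- rowSums(delta * s) / n       # candidate coordinates for this ordering
  val <- stress(x)
  if (val < best$stress) best <- list(stress = val, x = x)
}

print(round(best$x, 3))
print(order(best$x))  # ordering of the documents along the dimension
```

The solution is determined only up to reflection, so the reversed ordering is equally valid. With 8 contributions there are 8! = 40320 orderings, which is still small enough to enumerate exhaustively.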
In order to get some insight into the topics covered in the submissions, we fit a latent Dirichlet allocation (LDA) using the topicmodels package (Grün and Hornik 2011). We fix the number of topics to K = 5 and use the 5% most important terms in the DTM only, as judged by the tf-idf. At the end we print out the five most important terms within each of the five topics.
R> fitca <- anacor(as.matrix(dtm2), ellipse = FALSE)
The resulting symmetric CA map is given in Figure 4. Note that instead of plotting all the 239 terms contained in the reduced DTM, we plot the five top words per topic only.
We see that the most extreme contributions in the CA space are those by John Fox and Allison Leanage on the one hand, and Don Ylvisaker on the other hand. In fact, as described below, unlike the remaining contributions these two papers are not related to any of Jan's methodological developments. Therefore, we slightly overrule the unidimensional scaling solution and order the submissions as follows.
As pointed out in Section 4, one of Jan's main achievements is the founding of JSS. Fox and Leanage (2016) take us on a historical R/CRAN journey, trace the continuing development of R and CRAN in the pages of JSS, and tell us when we reach the asymptote of R package developments based on the fit of several growth models.
Pieter Kroonenberg is one of Jan's former Ph.D. students at Leiden University. In his very unique and entertaining style, Kroonenberg (2016) takes us back to the early days of the Department of Data Theory, introduces the reader to three-way data analysis, and recalls interactions with Jan. Things become particularly sparkling when Pieter tells us how Jan does book reviews overnight after "prolific drinking sessions".
Apart from developments in the area of multidimensional scaling (MDS), Jan is best known for being the scientific father of the Gifi group, starting with his dissertation (De Leeuw 1973). This Dutch group of researchers at Leiden University established an "über-framework" of scaling methods for categorical data of tremendous flexibility (and an even more tremendous amount of Psychometrika publications throughout the 80s). Van der Heijden and Van Buuren (2016) grant us access to the Gifi universe and discuss the past, present, and future of the Gifi system.
We are especially honored to have John Gower on board for this Festschrift. He is one of the key figures in early scaling developments and has 60 years of experience in the field of biometrics (he started at Rothamsted in 1956 under Frank Yates). In his contribution (Gower 2016) he elaborates on the interaction between biometrics and psychometrics in relation to linear algebra and optimization, and puts Jan's methodological efforts into this context.
We are not sure whether "competition" is the right word to describe parallel developments in correspondence analysis (CA) of the French school (multiple CA; singular value decomposition at its core) and the Dutch school (homogeneity analysis; alternating least squares (ALS) optimization at its core). Husson, Josse, and Saporta (2016) from France juxtapose the two worlds, show connections, and elaborate on Jan's influence on the French school. At the end Gilbert Saporta wonders whether Jan tasted the non-vegetarian special sausage called "andouillette" at the 1976 Psychometric Society meeting in Grenoble.
Takane (2016) writes, "when he was young, Jan was saying that he would not publish a paper unless it was accepted essentially as was in the first round of review". With this brave attitude it is quite natural that some articles ended up being published as research reports only. Yoshio Takane, who is one of Jan's earliest collaborators, picks up on some of Jan's lost papers from the 70s. He describes their collaboration (also involving Forrest Young) on the ALSOS project and describes exotic (lost) algorithms like ELEGANT, INDISCAL, and DEDICOM. At the end Yoshio brings up a problem Jan never replied to. The next move is yours, Jan!
Groenen and Van de Velden (2016) review one of Jan's most important methodological developments: SMACOF. SMACOF represents a majorization approach to solve the MDS stress minimization problem. This general concept offers a vast amount of flexibility in terms of fitting all sorts of extended MDS models (three-way models, constrained MDS, unfolding, etc.) and is nowadays considered the state-of-the-art approach in the MDS area.
According to his website, Jan's magnum opus ("when I was no longer young and wild"), was the establishment of the UCLA Department of Statistics. Don Ylvisaker was the key figure behind Jan's move from picturesque Leiden to megalopolis LA. He reflects on Jan's transition and how much effort it took him to establish a department which is nowadays one of the top-ranked in the world. Ylvisaker (2016) represents the grand finale of this Festschrift.

A brief history of JSS
"[...] If I get generally positive reactions, I will start creating an editorial board, and I will start making room in our WWW."
He mailed the list back the next day that he was, "absolutely buried under positive reactions, useful suggestions, and (even more importantly) offers to help out and tentative submissions." And so JSS was born. The new journal would be free for both readers and authors, and would harness the increasingly ubiquitous internet for essentially free, widespread distribution.
Less than a week later, the commercial publishing company Birkhäuser wrote to Jan asking if he wanted to figure out a "mutually beneficial relationship". Jan managed to resist this and all subsequent attempts to commercialize JSS (though in the early days he was willing to at least consider the idea of advertising; De Leeuw 1995b). He appreciated very early on that the Internet allows scholars, who do so much of the work of typesetting, editing, and reviewing technical papers, to put the works online themselves, cutting out the middleman: commercial publishers. The profits of scholarly publishing houses have long been obscene (McGuigan and Russell 2008); in 2011, operating profit margins of the largest academic publishers approached 40% (Harvie, Lightfoot, Lilley, and Weir 2013). University libraries must strain to purchase subscriptions to scholarly journals, and the public is largely locked out of the research results it funds. One can only imagine how different the situation would be (for readers, authors, and libraries alike) if more academics followed the example of Jan, and set up their own non-profit scholarly "publishing houses".
That JSS was published (only) online was not its only innovative aspect. Source code for the statistical software discussed in the articles was also presented alongside the papers, available for download, testing, and extension. This was a major step forward in reproducible research. Also remarkable is that the code was reviewed alongside the paper. An early "Instructions to Authors" page cautioned, "Code should be readable, should have comments, and should not have hard-coded constants for particular applications or examples. We intend to be very critical here, because code has to work, has to be relatively pleasing to the eye, and has to be reasonably efficient. We do not insist on elegance and efficiency in the computer science sense, because we are statisticians and not computer scientists." JSS came out with its first three issues in 1996, and added one more issue to that first volume in 1997. Of the four contributions in the first volume, two described code in XLISP-Stat and two described code in S-PLUS. As the years marched along, the languages described in JSS would evolve (as described in Fox and Leanage 2016) and the pace of publication would increase. Despite these changes, the defining features of the journal would remain the same: It would be distributed via the Internet, provide peer-reviewed software, and be free of charge for both authors and readers.

Evolving article format, webpage, and copyright
JSS positioned itself on the frontier of open access publishing, reproducible research and review of statistical code, and in the early years the format of published articles was something of a wild, wild West. The "Instructions for Authors" circa 2002 stated, "Papers should be submitted in an Internet-friendly format. This could be HTML or Postscript. We prefer HTML, for instance as produced by LaTeX2HTML or RTF2HTML, or texi2html, with figures and formulas in either GIF or PS. Actually we prefer two versions: one in HTML browsing, and one in Postscript for downloading. It also makes sense to submit in TeX or LaTeX (we can translate it easily) or in WP/Word (we can translate it to Rich Text Format and then to HTML or to Postscript)." JSS papers lacked, in those early years, a uniform look and feel.
Fortunately, by April of 2004, Jan had a new sheriff in town with the title "Technical Editor": Achim Zeileis. Zeileis set to work developing new LaTeX style files for JSS, and from volume 11 onward ensured that all JSS articles used them; there would be no more submissions accepted in formats other than LaTeX. Zeileis also developed and enforced a "house-style" that covered many details of article presentation, from the font-size used in graphics, to the use of italics, to the format of acronyms and their expansion, to the application of title style in references and sentence style in section headings, to the format of code, and so on and on. Zeileis was a brilliant "hire" by Jan, equipped with an ever-stricter parser for style conventions. His influence gave the journal a whole new aspect, which surely had a hand in the increasing reach and importance of the journal from 2004 to the present.
The improvements in the format of articles were matched with improvements in the JSS website. Since the beginning, the website has featured Salvador Dalí's famous painting, The Persistence of Memory, which since the advent of JSS' style files has also been featured in thumbnail size on each article. In 2000, the domain http://www.jstatsoft.org/ began to be used; prior to that readers accessed the journal at http://www.stat.ucla.edu/journals/jss/. From 1996-2007, the website took the form of a PHP application. In 2007 a new frontend appeared, written in Rails by UCLA Statistics staff member Jose Hales Garcia. In 2010, a backend was added to allow the editors to better manage the review process, also in Rails, and also by Hales Garcia. 2015 brought the largest website-related changes to date. The website and editorial back-end were moved from a server at UCLA to Universität Innsbruck, Austria, where Zeileis and his student Reto Stauffer led efforts to transition JSS to Open Journal Systems (Public Knowledge Project 2016), journal management software released under the GNU General Public License. Use of OJS eliminates a good deal of back-and-frontend programming work, adds features to help organize the review process, and generally facilitates journal management.
The copyright policy of JSS also evolved. From 1996-2006, the policy was that, "Authors keep the copyright. This means that as far as JSS is concerned, authors can publish their submissions elsewhere as well (if the other outlet does not forbid this)." From 1996-2002, the sentence, "We shall consider the appropriateness of the GPL (GNU Public License)" was also included. From 2003-2006, JSS was mum, at least publicly, on the issue of code copyrights. Starting in 2007, it was announced that, The Journal of Statistical Software has chosen to apply the Creative Commons Attribution License (CCAL) to all articles we publish in this journal. Under the CCAL, authors retain ownership of the copyright for their article, but authors allow anyone to download, reuse, reprint, modify, distribute, and/or copy articles in Journal of Statistical Software, so long as the original authors and source are credited. This broad license was developed to facilitate open access to, and free use of, original works of all types. Applying this standard license to your work will ensure your right to make your work freely and openly available. Code distributed with JSS articles will use the Creative Commons version of the GNU GPL.
In 2014, this policy was changed by a contentious vote of the editorial board to allow code to be released under any license that is GPL-compatible. Many on the editorial board felt this signaled a step away from advocacy of free software and support of the Free Software Foundation. However, those in favor of the spread of more permissive GPL-compatible licenses that allow released improved versions to be closed-source won out.

JSS' support system
At the time of JSS' inception, computational statisticians who spent a lot of time developing code were in need of new outlets to describe their efforts. "A more formal forum where statistical software is reviewed and acknowledged as a scientific work," as Martin Mächler put it in early deliberations in 1995, would have a ready audience of authors and readers. JSS was very well-positioned to serve this audience as reproducible research and open access publishing gained currency. This computationally-inclined audience was also well-positioned to serve JSS by putting somewhat more effort into typesetting their papers as compared to what was required by other, commercial journals.
UCLA lent the journal support from the beginning, both in tangible resources and also in boosting its legitimacy and visibility. The Department of Statistics provided programmers for front-and-backend development of the website, the server on which JSS was hosted, and graduate students to act as editorial assistants. Despite Jan's retirement in 2015, UCLA's support in terms of graduate student assistance continues.
Since the move of the journal server to Austria in 2015, Universität Innsbruck has begun providing much of the support formerly provided by UCLA. It now provides programmer assistance, the server on which JSS is hosted, as well as graduate student support. Further supporting the success of JSS is the army of Associate Editors and referees that donate their time to managing the review process. They ensure the high quality of articles that make it through the pipeline to publication.
The Foundation for Open Access Statistics (FOAS) is another arrow in JSS' quiver of support. FOAS was established as a US-based nonprofit in 2012 with a mission to promote free software, open access publishing, and reproducible research in statistics. The only project financially supported by FOAS is JSS; it has allowed JSS to save a coffer of donations from the statistical computing community to act as a buffer during any changes in university support in the future.

JSS by the numbers
The output of JSS in terms of articles, code snippets, book reviews, and software reviews has risen over the years, as shown in Figure 8 (note that unlike in Fox and Leanage 2016, we count the publication year of an article as the date it became available on the JSS website, not the publication year of its associated volume).
As the number of publications has risen over the years, so has the impact factor of the journal. The SCImago Journal & Country Rank has listed JSS as the number one open access journal in "Statistics and Probability", and in the top ten "Statistics and Probability" journals overall for the last several years (Elsevier B.V. 2016b). Figure 9 shows three measures of journal impact for the years 1999-2014 (as of this writing, data for additional years were not available), using data from Journal Metrics (Elsevier B.V. 2016a). Source Normalized Impact per Paper (SNIP) is a measure of contextual citation impact in a given field (citations in fields that do not output large numbers of publications are weighted more). Impact per Publication (IPP) is the number of citations received in a given year by papers published in the three previous years, divided by the number of papers published in those same three years. SCImago Journal Rank (SJR) weights citations according to the subject field, quality and reputation of the journal.
JSS has also seen a marked increase in web traffic over time. De Leeuw (2006) estimated that the JSS webpage received about 50,000 visits per year. By 2015, this had increased by a factor of more than eight, to 410,000 visits (where a single visit is defined as a unique visitor, singled out by their IP address and browser, browsing the site or downloading papers/software for up to three hours). Figure 10 shows full-text downloads per day from November 2015 to June 2016.
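The verbal definition of IPP can be written compactly as a formula. The notation is introduced here for illustration only and does not come from Journal Metrics: $C_Y$ denotes the citations received in year $Y$ by papers published in the years $Y-1$, $Y-2$, and $Y-3$, and $P_t$ denotes the number of papers published in year $t$.

```latex
\mathrm{IPP}_Y = \frac{C_Y}{P_{Y-1} + P_{Y-2} + P_{Y-3}}
```

For example, under this reading a journal that published 100 papers over the three preceding years and received 250 citations to them in year $Y$ would have an IPP of 2.5 for that year.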

JSS outlook
Jan left JSS in the best shape it has ever been in. It has the support of three dedicated and talented Editors-in-Chief, namely Zeileis, Grün, and Pebesma, able to marshal resources at the Universität Innsbruck, Johannes Kepler Universität Linz, and the Universität Münster. FOAS has raised a fund to cover future financial contingencies, and provides a framework for the journal to raise further funds as needed. The move to the OJS platform means that many front-and-backend development needs are taken care of.
Side-effects of JSS' success include an ever-increasing number of submissions, and, as shown in Figure 8, a general trend toward more publications per year. This situation has put some stress on the editorial board. The use of OJS has streamlined the editorial process but the transition to this new platform also required the energy and attention of the Editors-in-Chief. As they managed the transition, the number of papers waiting for processing grew. The result has been longer turnaround times and a rather long publication delay for accepted manuscripts. This backlog is at present being whittled back down to size. It is expected that time-to-publication will again shorten in the year to come. As the journal moves through its 20th year the future looks bright indeed.

Concluding remarks
Jan's style has been to push a lot of information out into the world. He has been writing textbooks on statistics online since doing so became possible, and continues to do so to this day. His reports, notes, and papers fill volumes, all of which he has collected for our browsing pleasure. He is nothing if not generous regarding sharing his knowledge. This spirit of sharing scholarship with all interested is an essential element of what makes JSS tick, and it was inherited from Jan.
If one aspect of the many projects and collaborations that Jan has started is openness, the other is a certain open-endedness. Many of the collaborators in this special volume mention problems that they still want to work on with Jan, or in some cases, have Jan work on. Many of Jan's books online remain in draft form. There is and will remain much to be done. We wish Jan success in all of it, and speak for all the contributors to the special volume in saying we are grateful for the chance to have been a part of the journey thus far!