Use Of Free and Open-Source Software In Teaching BI and Data Sciences: Bending the “Golden Rule” For The Sake of Wholesomeness

(Read the French version of this post)

An introduction of sorts

The main challenge, the so-called “golden rule” for BI- and Data Sciences-oriented degrees, is undeniably to train professionals who will be sought after in the job market. Choosing the right IT tools to teach is one of the key factors, especially given how much choice there is: exotic software, of which you cannot tell whether it is here to stay or should be used with caution; old software (perhaps too old for some) that has nevertheless been tried and tested over time; commercial software that is expensive and lousy or user-unfriendly, yet widespread, even ubiquitous, and which as a result, against all logic, manages to make itself count; and free and open-source software, some of it highly commendable, which nevertheless enjoys only a small market share among businesses. Overwhelmed by this wide range of choices, and having in addition to comply with bureaucracy and various budgetary and legal constraints, academics face a dilemma: on the one hand, the aforementioned goal of teaching students the software that is popular with businesses; on the other hand, the other purpose of universities as centres of excellence and temples of knowledge, which is to spread knowledge, good practices and the best methods resulting from the latest research.

In this essay, after referring to two sources that show businesses’ actual preferences in BI and data sciences software and programming languages, we will see how the sacrosanct “golden rule” has been either invoked on certain occasions (e.g. to account for an unfair and anti-competitive “partnership” concluded in plain sight between the State administrations and one of the infamously aggressive commercial software vendors) or violated on other occasions, against all logic. To whose advantage? The same vendors’, it seems.

Knowing what companies want

Ricco Rakotomalala is rather well-known on the Web: he is an academic and, since 2013, head of the Master SISE at the University Lyon 2 (a master’s degree which has its own, rather interesting, YouTube channel, “Lyon Data Science,” created in 2015), as well as the creator of free and open-source data mining teaching software (SIPINA, TANAGRA) and the author of many tutorials and DRM-free ebooks.

In a January 2018 YouTube video dealing with teaching R and Python in Data Science degrees, Mr. Rakotomalala gives a lecture of more than an hour and a half, the result of “an in-depth reflection on the key characteristics that teaching software should have”. The first of these characteristics is the “golden rule” which we have already presented in the introduction above.

The software taught in university courses must be that actually used in companies or, at the very least, resemble it.

The rationale behind this rule is clear: young graduates must be trained in the tools used by companies likely to hire them. So, how can we find out what companies want?

KDnuggets.com

For a modest university with a local scope, it is enough for its heads to get this kind of information from local companies; and even if their degree will quickly end up being locked into totally abusive “partnerships” with Microsoft, SAS or IBM, well, that’s how it goes: when you have to straddle the fence and regularly account for the budget granted to you by the State with decent hiring rates, all means are good.

Another idea may be to consult the results of a specialized survey, a complementary and thought-provoking source of information, especially if the survey covers other highly “computerized” countries of the world. One of the surveys on statistical analysis software and languages has been conducted every year in May for the past twenty years by the site KDnuggets.com. In his January 2018 video, Mr. Rakotomalala compares the results of 2017 (it will be 2019 for us) to those of 2005 and makes a few comments (listen to his explanations in the video of his conference, starting at 33:27):

  • https://www.kdnuggets.com/polls/2005/data_mining_tools.htm
    R, although not number one in the ranking, is extremely popular, including in Mr. Rakotomalala’s personal experience: most of the traffic drawn to his personal website consists of people coming for his R course materials and exercises. R has made an impressive climb in this ranking, moving up from the bottom to second place (of the 2017 ranking). The speaker finds R’s evolution remarkable, which distinguishes it from other open-source statistical analysis tools that had the same genesis but remained on the margin: this was notably the case of open-source software of which Mr. Rakotomalala was the creator for one (TANAGRA) and the contributor for the other (SIPINA). “R was devised as an open-source tool for academic research from the very beginning… In fact, there’s a NYTimes story [from 2009] that explains the origin of R, with Ross Ihaka and [Robert] Gentleman, where they explain the path and all that… it’s an academic work. I could have done the same thing. The key element is that this tool started being used by companies.” (48:23)
  • The evolution of the Python language has been even more impressive: more popular than R, more popular than anything else data sciences-related, Python has dominated this ranking for a few years now, while in 2005 this scripting language wasn’t even part of it.

We should also note that Excel was a sufficient piece of software for statisticians in the 1990s and 2000s and that it remained a must-have in 2005 (3rd place in the ranking). Today, however, although still a must-have (4th place), this software is no longer sufficient: there’s way too much data now to fit on a spreadsheet, the systems are interconnected and web-oriented, and many procedures are automated. One tool of choice in classical statistical studies, Excel is of no use in data sciences or machine learning.

r4stats.com

In order not to rely on a single argument from authority, namely the results of the KDnuggets.com survey, here is another one: the r4stats.com blog, a personal (and presumably independent) site whose “mission is twofold:

  • to analyze the world of data science,
  • and to help people learn to use R,” in a world dominated by the “big three” commercial software packages: SAS by SAS Institute, SPSS by IBM and Stata by StataCorp.

The author of this blog, created in 2012, is Bob Muenchen, an American university professor who

  • handles the research software contracts for The University of Tennessee (information provided in response to one of the blog readers);
  • is familiar with R as well as the aforementioned commercial software as he is the author of “R for SAS and SPSS Users” and “R for Stata Users” textbooks.

Now, what interests us most today, is that Mr. Muenchen compiles an annual ranking of the most popular software and languages used in data sciences, based on keywords taken from job postings published on job search site Indeed.com. The latest results of his ranking, which covers the period between February 2017 and May 2019, however different from the results of the KDnuggets.com survey,

  • still give Python the sweeping first place;
  • show that R is almost twice as widespread as SAS.

In 2012 Muenchen was already wondering whether 2015 might not be the “beginning of the end” of the dominance of SAS and SPSS commercial software, in favour of R (and not of Stata, another commercial software package, which at the time was the fastest growing). More certain of his assumption the following year, Muenchen revised his forecast in 2013 and predicted that the beginning of the end of SAS and SPSS would start in 2014.

Strengths and weaknesses

One cannot disagree with Mr. Rakotomalala that in the current context, which could be described as the “Big Data era” (Big Data commonly refers to a large volume of varied data, accumulated at high speed, which must be analyzed in real time), R and Python really stand out and fully justify the investment that could be dedicated to them within training courses.

The adoption of R and Python in data science degrees seems obvious. However, it is weaker for R. So let’s analyze its strengths and weaknesses.

R’s strengths

  • its open-source license guarantees two fundamental characteristics:
    • access to the source code, guaranteeing us some control over the computing algorithms and operations actually performed in the background;
    • free use and exploitation, whatever the context of use.
    • The source code (R, C, FORTRAN) is free of access and consultation, only the modification is subject to validation by the moderating body, the R Foundation.
    • What’s more, the source code is of quality, meaning that it is copiously accompanied by comments and respects certain conventions. In particular, the implemented instructions are accompanied by references to published scientific articles, offering the reader the possibility to compare the formulas in the research papers and the corresponding instructions implemented in the source code.
  • a renowned computational engine (LAPACK), also used in other software on the market (Maple, MATLAB, Stata),
  • a whole lot of extensions — third-party packages,
  • a language that is easy to understand and learn; moreover, being close enough in its syntax to Python, R can both make learning Python easier for those who have never done it, and, conversely, be assimilated faster by someone who has prior working experience with Python.

“The first year I introduced Python for the bachelor’s degree [in the fall of 2015], the students then in Master’s degree came to me and said that they too would like to learn it… because they felt that everyone was talking about Python more and more, both on the Internet and in the press. These were people who had never worked with Python in their lives [but who already knew R]. So I offered them an improvised 6-hour training course on the last Friday before the Christmas break… For one, everyone showed up! The training session went very well, everyone was happy. I then asked everyone to do a project in Python —and not the easiest one— and they did very well. It shows that the gap between Python and R is not huge and that the students are able, after a training of only 6 hours, to make interesting projects —and with a minimal pedagogical cost: the programming framework is the same, the syntax is a bit different, the package names are different too, but the rest is the same.” (01:11:27)

  • a wealth of online documentation sources, official and third-party;
  • a user-friendly, forum-like bug tracker;
  • continuous improvement of the ecosystem: testing, identification and correction of bugs; performance improvement; addition of new features; correction of “cosmetic” aspects such as the quality of graphics rendering.

R’s weaknesses

(which are in fact never other than those of any software, open-source or commercial)

  • when an increased interest in a technology at a certain point in time is a fad (a famous speaker presented something, his demo was pretty convincing, then media from all over the world picked it up, feeding a months-long buzz);
  • when there are backwards compatibility problems (such as an R or a Python script that worked well a few years ago but no longer works today → solution: indicate in the script the version of R or Python for which the script was developed at the time);
  • when graphic libraries produce “exotic” or odd diagrams while one expects something more conventional or neat → solution: search the Internet to see if someone has had this problem before and how it was solved (for example, by using another library or another command), and if not, feel free to ask the question on the Internet yourself.
  • Problems may exist in connection with third-party packages, but these are subject to the laws of natural selection, where the survival of a package depends on its real interest for the users: a package rife with bugs and without real interest will not take off and will end up being abandoned even by those who had adopted it, whereas a package that is of interest to the community will see its issues quickly corrected by contributors from all over the world and will eventually outlive its creator. However, this maintenance mode has its limits: while problems deemed to be blocking (slow calculation or calculation inaccuracy) are corrected very quickly by the community of researchers because these aspects are of real interest to them, problems of a “cosmetic” nature (such as the quality of graphics rendering) or a lack of user-friendliness take longer to resolve. Thus, the collaboration around RStudio, a good IDE for R, started only in 2010 and the first stable version was released only in 2016, more than twenty years after the creation of R by Ross Ihaka and Robert Gentleman.
  • no remuneration for package developers;
  • absence of a sales force as “aggressive” as that of SAS Institute or IBM (vendor of SPSS).
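The remedy suggested above for backwards compatibility problems (recording in the script the version it was developed for) can be made explicit in code rather than left in a comment. Here is a minimal sketch in Python; the version number and the function name `check_version` are assumptions chosen for illustration, not something taken from Mr. Rakotomalala’s lecture. In R, a comparable check could be built on `getRversion()`.

```python
import sys

# Hypothetical: the interpreter version this script was written and tested with.
DEVELOPED_WITH = (3, 7)


def check_version(current=None, developed_with=DEVELOPED_WITH):
    """Return a warning string if major.minor versions differ, else None."""
    if current is None:
        current = sys.version_info
    current = tuple(current[:2])
    developed_with = tuple(developed_with[:2])
    if current != developed_with:
        return ("Warning: script written for Python %d.%d, running on %d.%d"
                % (developed_with + current))
    return None


# At the top of a script: warn the reader instead of failing obscurely later.
message = check_version()
if message:
    print(message, file=sys.stderr)
```

A warning rather than a hard stop is a design choice here: many scripts still run fine on a newer interpreter, and the message mainly tells a future reader where to look first if they do not.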

Let’s now assess the strengths and weaknesses of the commercial software SAS:

SAS’s strengths

  • SAS supporters usually recognize one strength in it (the only one, it seems), which is the reliability of the SAS computational engine, allegedly producing consistent results regardless of the processor architecture. This seems to be a self-proclaimed strength, however, because I haven’t found any documented evidence of exceptional and unparalleled performance or a written guarantee of the reliability of their computational engine, and Bob Muenchen says so, too, on his blog in response to one of his readers.

SAS’s weaknesses

  • cost prohibitive for smaller organizations;
  • in some cases inaccessible to individuals;
  • contracts with complicated and restrictive terms of use, with heavy legal sanctions;
  • closed source code (therefore unauditable);
  • a notoriously unpalatable programming language with no logic in its syntax. Unlike the R scripting language whose developers, it seems, kept in mind the idea of grammatical consistency, the SAS scripting language is so ridiculously inconsistent one may wonder whether it might have been developed by several researchers who perhaps never talked to each other and never got acquainted with other SAS commands than those they developed for themselves. The fact that SAS’ defects are not remedied over time says much about SAS Institute’s unwillingness to question itself and move toward more modern semantic paradigms, and its indifference towards the poor user experience with its software.

But let’s go back to the “in-depth reflection on the key characteristics that teaching software should have”.

Apart from the “golden rule”, any other recommendations?

Software should be able to store and preserve the analysis steps already completed so that the user does not have to repeat them each time the calculation program is opened.

We’re talking about a feature that allows the commands to be saved in a script file, which is well known and appreciated.

The way computing results are rendered to the user who requested them must be as clean and precise as the instructions he or she sent.

I’ll get to that point right below.

Conveying instructions to the computer (i.e. programming) must be done by means of verbal language (i.e. lines of code) and not by diagrams, drawings, or drag-and-drop or click-button behaviours.

The so-called simple interfaces, allowing programming by moving objects around on the screen, were popular at the dawn of the personal computer era in the 1990s, when the general public was barely getting acquainted with IT. Today, at the dawn of the 2020s, this kind of approach, although extremely widespread, may only be good for teaching programming to young children. So, Tableau, if you can hear us

When several existing tools are available to solve a given problem, they should all be taught and students should be allowed to choose which one they prefer.

“In my econometrics course, I make my students do six 1-hour 45-minute practical sessions. In three of them I have them practice Excel —obviously because students will do their internships with Excel. Then another practical session in R, and a tutorial: the step-by-step guide I wrote for the session, with all major steps outlined; and in only 1 hour and 45 minutes, the students can learn R programming skills, so I don’t see what the difficulty is. Then I make them do a fifth session in Python in the same way. Finally, the last one, the sixth, is the test —and they are allowed to choose their tool. Every year, about half choose Excel, about a quarter choose R and a quarter choose Python.” (55:22)

The teaching of programming languages must be given as a dedicated subject, outside the subjects teaching theoretical methods.

The training must be constructed in such a way that its essential parts, the theoretical course and the practical application on machines, do not cannibalize one another but each get the training time they deserve.

The software used in practical sessions (TD) must be available for students to practice at home.

The barriers here are the cost of the software license and the compatibility of the software with the operating system installed on the student’s computer. Because even if the university pays the license fee for all its students, the installation can still be problematic.

  • First, installing SAS also requires installing virtualization software and setting up a virtual machine… a real pain for users who don’t like to solve technical problems but just want a tool that does the job out of the box.
  • Secondly, since SAS only exists for Windows, a student like me, who chose to run my computer with Linux, cannot install it, unless I use the SAS Web client (which I used when I had to) or use WINE or an emulator (brave people tried it, in 2009 and 2010, with more or less success). The most reasonable alternative appears to be installing R instead of SAS: R exists for Windows, Linux and macOS (which, like Linux, belongs to the Unix family) and can be installed without driving the user crazy.

A piece of software or a programming language must be able to evolve organically to remain relevant, with new features popping up and evolving spontaneously under conditions that make it possible for them to be peer-reviewed.

For example, for such a trendy issue as the “Big Data” —huge amounts of varied data accumulating at high speed in real time— to be addressed, modern statistical software must be able to do things it didn’t need to know how to do in the past, like supporting

“Thanks to the dynamism of R and the community [of users] that drives R forward, we have features that are really, really interesting,” says Ricco Rakotomalala (01:05:14).

The barriers here are

  • compatibility: when the development of new features can only be done on machines of a certain type or using a certain type of software;
  • cost: when development of new features requires relatively (or extremely) expensive software;
  • closed source: when the source code is the intellectual property of a person or a company, thus making access to it impossible, let alone editing it.

The best solution seems, once again, to opt for open-source software.

Curriculum needs to be adapted to the students’ skills.

“In L3 DSI [senior year of Bachelor’s degree in data sciences and computer science], I have a programming course that is complemented with an algorithmic course. We made the choice to work under Delphi. What’s very interesting is that twenty years ago Delphi was taught in M2 [senior year of Master’s degree]; we took it down to M1 and then to bachelor’s level. My data mining course, which I was doing in M2, now I’m doing it in M1. So there’s a real progression in students’ skills.” (01:06:18)

Curriculum needs to keep up with its time.

“What’s great about computer science is that you can’t fall asleep: you have to watch what’s going on all the time because it’s changing so much. At one point, we wondered if it was still relevant to work with Delphi. Because if you go to the Apec [Apec.fr is one of France’s major job-offer databases for skilled white-collar workers] and run a search on the keyword “delphi”, you’ll still find a few job offers, we agree on that [the search returned 71 job offers from all over France, in contrast to the Python keyword, which returned 1,819 results]. Delphi was a very popular IDE fifteen years ago, but hey: it was fifteen years ago.” (01:07:04)

Tap into multiple sources for insights

Be it about including a new course on an interesting but still little-known piece of software or programming language; or about taking out of the curriculum a software tool or a programming language that has had its day: how do we know if the time is right? The best strategy appears to be tapping into multiple sources for insights. In this case, Mr. Rakotomalala explains that what led him to replace Delphi with Python in 2014 for the 2015 school year, was a combination of early warning signs:

  • Python was slowly getting to the top of the most popular languages on KDnuggets;
  • LeMondeInformatique.fr published an article entitled “Python goes to the top of machine learning languages and replaces Java”;
  • trend analysis results of job postings picked up on Apec.fr and elsewhere on the Internet.

Mr. Rakotomalala recounts how in 2017 a group of students from his master’s programme did a text mining project on several hundred job offers from companies based all over France, written in French and found on the Web (LinkedIn, Apec.fr, etc.), which the students first had to read and tag by hand… The goal was to identify the keywords most strongly associated with job offers in “classic” statistics as opposed to those in data sciences. The results show that Python enjoys a rather singular place among them. Job offers for positions in “classic” statistics most often feature the keyword “statistics”, followed by (in no particular order here) “SAS”, “SAP”, “Excel” and names of databases. The picture is different for job offers in data sciences: the most frequent keyword is still “statistics”, followed by “Python”, “R”, “machine learning”, “algorithm”, “SQL” and “English”.

Therefore, it is not surprising that in the decision tree of the predictive model automatically built on the basis of these job offers, the first criterion for singling out the job offers that are genuinely data sciences related is the word “Python”. Way more than a fad, skills in Python are truly part of companies’ needs.
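To make concrete what it means for a word to be the first criterion of a decision tree, here is a minimal sketch in pure Python. It is not the students’ code, and the eight postings below are invented and deliberately built so that “python” separates the two classes: the sketch illustrates the mechanics of choosing a root split, not the finding itself. The splitting measure used (weighted Gini impurity) is also an assumption; the lecture does not say which one the students’ model relied on.

```python
from collections import Counter

# Invented toy postings (keywords, hand-assigned label); NOT the students' data.
POSTINGS = [
    ("statistics python machine-learning sql english", "data science"),
    ("statistics python r algorithm", "data science"),
    ("statistics python r machine-learning english", "data science"),
    ("statistics python sql", "data science"),
    ("statistics sas excel", "classic statistics"),
    ("statistics sas sap", "classic statistics"),
    ("statistics excel sql", "classic statistics"),
    ("statistics excel r", "classic statistics"),
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_first_split(postings):
    """Return the keyword whose presence/absence best separates the classes,
    together with the weighted impurity obtained for every keyword."""
    docs = [(set(text.split()), label) for text, label in postings]
    vocabulary = sorted(set().union(*(words for words, _ in docs)))
    scores = {}
    for word in vocabulary:
        with_word = [label for words, label in docs if word in words]
        without = [label for words, label in docs if word not in words]
        scores[word] = (len(with_word) * gini(with_word)
                        + len(without) * gini(without)) / len(docs)
    return min(scores, key=scores.get), scores


root_word, impurity = best_first_split(POSTINGS)
print(root_word)
```

A word such as “statistics”, which appears in every posting, separates nothing and therefore gets the worst score, even though it is the most frequent keyword in both classes: frequency and discriminating power are two different things.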

Let’s have some fun and try the same experiment

Out of curiosity, I ran a similar experiment to find out how often the names of different software or programming languages taught today in several post-graduate degrees in data sciences and BI appear in recent job offers. So I launched a few search requests:

  • in late December 2019;
  • on two job search sites, Apec.fr and LinkedIn.com, that I chose arbitrarily but whose reputation and diversity of offers should arguably ensure representative and coherent results;
  • in three geographical areas: all of France, the Île-de-France region and finally the two departments of the former Nord—Pas-de-Calais region where two Master’s degrees offer training in BI, data sciences and machine learning: the SIAD master’s degree of the faculty of economic and social sciences, and the DS master’s degree of the faculty of science and technology;
  • by accompanying the keywords of interest with one of the two subsidiary keywords —“statistics” or “data”,— which are supposed to both refine the search and, being sufficiently generic, not to bias the results;
  • without specifying any other criteria.
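The protocol above amounts to a full grid of sites, areas, keywords and subsidiary keywords. As a rough sketch, the list of requests can be enumerated as follows; the sites, areas and subsidiary keywords come from the list above, while the keyword list is only an assumed subset given for illustration. The searches themselves were typed by hand: nothing here queries Apec.fr or LinkedIn.

```python
from itertools import product

SITES = ["Apec.fr", "LinkedIn.com"]
AREAS = ["France", "Île-de-France", "Nord and Pas-de-Calais"]
SUBSIDIARY = ["statistics", "data"]
# Assumed subset of the keywords of interest, for illustration only.
KEYWORDS = ["python", "sas", "vba", "access"]


def build_queries(keywords=KEYWORDS):
    """Enumerate every (site, area, query) combination of the protocol."""
    return [
        {"site": site, "area": area, "query": f"{keyword} {subsidiary}"}
        for site, area, keyword, subsidiary
        in product(SITES, AREAS, keywords, SUBSIDIARY)
    ]


queries = build_queries()
print(len(queries))
```

Writing the grid down this way mostly helps to check that no combination was forgotten when the results are tabulated.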

And this is what we observed:

(It should be noted that the requests concerning R, which were initially intended to be included in this sample, had to be completely excluded from it because their results were systematically “polluted” by unrelated offers containing the acronym “R&D”. Cleaning them up would have required further text-mining work, whose hourly volume would have been disproportionate to the interest and the very raison d’être of this essay).
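For the record, once the text of the offers has been downloaded, the “R&D” pollution can be filtered with a regular expression. The sketch below is an assumption about how one might do it, not something I actually ran on the offers; the function name `mentions_r` is made up for the example, and it cannot fix the search engines themselves, which decide on their own how to interpret a one-letter query.

```python
import re

# A standalone capital "R": not glued to another word character or to "&",
# and not followed (even after spaces) by "&", which rules out "R&D" and "R & D".
R_LANGUAGE = re.compile(r"(?<![\w&])R(?!\w|\s*&)")


def mentions_r(text):
    """Return True if the text seems to mention R as a language name."""
    return R_LANGUAGE.search(text) is not None


print(mentions_r("Skills: Python, R, SQL"))
print(mentions_r("R&D engineer, statistics"))
```

The pattern is case-sensitive on purpose, and it will still be fooled by other uses of an isolated capital R (initials, for instance), so a manual check of a sample of the matches would remain necessary.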

We can make several observations regarding the results:

  • LinkedIn systematically returned considerably more results than Apec.fr, the only exception being the “python statistics” request for the Nord and Pas-de-Calais departments;
  • with a few exceptions, searches containing the subsidiary keyword “data” tend to return more results than the searches containing the subsidiary keyword “statistics”, and this for both search engines;
  • a low number of results for two Microsoft technologies: VBA (Visual Basic for Applications) and Access:
    • VBA is the development environment created by Microsoft for the Visual Basic scripting language, also created by Microsoft, to program “macros” (automated procedures) for the Office package (another Microsoft creation): Word, Excel, PowerPoint and Access.
    • Access is the DBMS of Microsoft’s Office package, which was popular in the 1990s and 2000s for lack of alternatives at the time, but in which the market has started to lose interest since the beginning of the “Web 2.0” era and the massive arrival of all the various Web platforms that make it possible to store and edit data online, directly in the browser.

These are two examples of software whose teaching goes against the “golden rule” and which should in principle be removed from a Data Science curriculum. Especially given that the number of teaching hours in a master’s degree is limited and it is often difficult to add hours of teaching without having to remove others.

A conclusion of sorts

Being used in companies is, for a given IT tool, undoubtedly one of the determining factors in the decision to include it in a BI- or data sciences-oriented degree; it is indeed the “golden rule”, they say, among the choice criteria: the more widespread a piece of software is, the more it must be adopted.

But is this always true?

How IBM has gained market share in Eastern Europe

Somebody told me how IBM manages to gain market share in Eastern Europe, which it then holds forever, as in a forced, life-long marriage. In those countries, major companies in the banking, finance and transport sectors (typical IBM clients in France) are not so big and usually cannot afford the unbearable cost of IBM licenses and operating expenses. However, since large companies in neighboring countries are already customers of IBM and benefit from transaction facilities and other benefits, new customers agree more easily, which allows IBM to convert more and more companies and gain more and more market share. These benefits, which are basically unnecessary and completely artificial, end up appearing essential. In the same way, before cigarette advertising was completely banned in France, ill-intentioned cigarette manufacturers used to make people believe that smoking made them more popular and their lives happier; and smokers, once addicted, no longer knew why they kept smoking every day, whereas at first they only smoked socially.

Once the need is created, IBM sends skilful salespeople to these countries, who pull out all the stops and negotiate with large companies and public services using their usual rhetoric (“IBM is the gold standard of security and reliability”, “IBM is big and serious, and so are its customers”, “IBM is American”…). For starters, they offer companies a trial period and a discount. It’s not yet a marriage but it is, so to speak, pretty much an engagement.

IBM software is designed to work best with IBM hardware, and companies that have started integrating the software are offered a trial of IBM hardware, again at a discount and even with payment facilities like a bank loan. Of course, the products they can afford with their modest budget are not of the latest generation or are even completely obsolete. “But hey, this is still IBM stuff, we’ll be fine!” think senior executives of the newly acquired IBM customers, without wondering about the woes that staff members, who will have to work directly with that hardware and software, will have to endure every weekday.

Once the mainframe servers are installed, the forced marriage is pronounced. Very quickly, the bill rises and, although the company will never stop regretting it afterwards, divorcing IBM would be even more expensive than continuing to incur huge operating and licensing costs. As a result, the company will not be able to afford an upgrade for years to come, condemning its staff to work with obsolete hardware and software. Furthermore, any possibility of compatibility with other software or hardware components will be eliminated: to be compatible with IBM products, a third-party hardware or software product must be IBM certified; and for this you must either pay IBM or be part of IBM. Needless to say, such practices make many people unhappy.

That’s the way IBM makes money, but chances are that this could just as well be the case for Microsoft: both

  • subject their customers to “shotgun wedding” contracts,
  • rarely comply with the international standards designed to allow for better compatibility and interoperability; instead, both always seek to reinvent their own wheel,
  • kill the tech industry with anti-competitive and unconstructive practices,
  • make big money without really contributing to innovation, while constantly claiming to be champions of progress.

While IBM and Microsoft software is of poor quality most of the time, it is still ubiquitous and makes itself count. Also, Microsoft has dangerously infected absolutely all nooks and crannies of France’s education and research ecosystem, a favour it does not deserve. How is that possible?

But you only have to listen to common sense to realize that their notoriety and market share is even more of a fake than Paris Hilton’s nanotechnological water (which at least is funny).

Wait… One cannot fool all the people all the time, right?

In our world, there are probably more believers or followers of a religion than there are atheists. Some countries have an “official” religion, while others, such as France, are secular. Which way is the most constructive? What does common sense say?

Religions, very much like the commercial software of certain predatory vendors, or the devices of certain predatory brands, can become perpetual sources of conflict and discomfort: to be happy with it, one must completely embrace it. They also have this in common that both allow for greater control over their users’ activities.

If we sought to measure how many people in the world have used Windows at least once during a given month, should we include people like me who, despite having stopped using Windows at home in favour of Ubuntu Linux in 2013, regularly have no other choice than to use Windows at work, in libraries, during practical sessions at university or when a friend asks me to fix a technical problem on his or her Windows PC?

Even though the number of people who have used Windows at least once in a month may represent, say, more than 90% of the world’s population, is that an indicator of its popularity? If we kept only those people who felt pleasure in using Windows or found it efficient, the rate thereby measured would quickly and brutally shrink.

And yet, Windows is everywhere, like a fair share of other Microsoft software.

But if that defies all logic, how can that be?

Unfortunately, all too often, counter-intuitive choices in the matter of software used in companies are due to the fact that the decision-makers are not those for whom using that software is part of their job. In France, this issue can be found in most major State-run agencies and private companies: public services, schools and universities, large banking and insurance institutions. Companies that choose commercial software, which is expensive and often of poor quality, user-unfriendly and prone to compatibility issues, shoot themselves in the foot at least three times:

  • they make their staff use poor quality software that creates problems and hampers their daily workflows;
  • spending large sums of money to pay for commercial software licences, their senior executives have to operate cost cuts elsewhere, often in hardware equipment, thus making their staff use old computers every weekday;
  • buying software or hardware from a predatory vendor such as Microsoft or IBM, their customers —among which the French State-run bodies— favour the status quo in the IT economy and feed a dangerous and anti-competitive oligopoly.

Bend that “golden rule”: Software and programming languages should be taught proportionally to their popularity among people who use them.

It is not a problem, per se, that commercial software such as SAS or Excel is taught in BI degrees: these two are indeed very widespread in companies, and students eager to be hired by IT contracting firms should adopt them, just as university degrees should teach them in order to see as many of their young graduates recruited as quickly as possible. However, the teaching of major software must also be fair and proportional not only to its market share but also to its popularity among the people who use it on a regular basis. And open source software should not be ignored.

Arguably, there is a problem when commercial software that is widespread but poor in many ways is the software of choice in a training course while well-known, widely recognized open source software, which has all the strengths of its commercial rival and none of its flaws, is not; or, at best, is taught as part of another course, and even then only through the personal initiative of a teacher who is more open-minded than the others. Even more regrettable is that outdated technologies such as Microsoft Access or VBA are still taught in some curricula, which violates the “golden rule” and is completely irrational. It seems in the end that this golden rule is not truly golden, and that it can be bent when needed.

Therefore, university professors had better regain their courage and their role as enlightened guides and creators of knowledge and, armed with the findings of their research, show the way to senior executives, whose job is not about thinking. All that is needed to capture their attention is to show them there is a good opportunity to cut costs.

It is clear that one cannot get large French companies and State-run bodies, many of which are today fully equipped with Microsoft software (Windows, Azure, Office, Outlook, Internet Explorer and so on) and IBM servers, to give them up overnight for open-source software, just as one cannot get a person fully equipped with Apple devices to adopt an Android smartphone. But that is no reason not to talk to them.

What, you think they won’t listen?

IT behemoths keep generating their needless, fake buzz and don’t care if anyone reads, watches or listens. As a result, their stuff is the one people hear the most.
