
Wednesday, October 2

Artificial Alien Intelligence

Like fashion, scientific research is subject to cyclical fads. Topics and techniques come into the limelight and then fall out of favor when the gains plateau.

In machine learning, NNs were in vogue in the early 90s when SVMs came onto the scene and blew them out of the water on almost every problem, and then were superseded in turn by log-linear models, CRFs and others. Then, at the beginning of this decade, NNs made a tremendous comeback, this time becoming the tool of choice for NLP as well.

Due to their name and a very crude similarity to how the biological brain is organized, NNs tend to invite comparisons with the human brain, and every improvement achieved with their help tends to be seen as a step towards Artificial General Intelligence, at least in popular media and among the general public.

Now, what would happen if tomorrow a different method, a different paradigm of machine learning, were to start performing better than NNs? There is no reason why it could not happen, AFAIK. We do not have theoretical results that say that NNs are better than every other method imaginable. They are as capable as anything else possible, but that only means that another method that is better at exploiting the given data may be able to beat them.

Would such a development still feel like a step towards AGI? When you hear about a CRF, it merely sounds like a fancy mathematical object which we have devised. We don't feel threatened by it. It is because NNs are presented in the form of neurons and similarity to biological brains that the question seems to have assumed this sense of urgency. Or is it because for the first time, these systems have started to come close to human performance in certain tasks?

How do you think the discourse will change if a method that cannot be presented as interconnected neurons firing takes the lead? Would we call it Artificial Alien Intelligence? Or are we now past the possibility of such reversals or upheavals, so that the next jump up will come only within the same paradigm?

Monday, November 26

The Wooden Pot

I had read in childhood
that a wooden pot cannot be put on the fire a second time,
and that one scalded by milk blows even on buttermilk before drinking it.
The pot of a society struck numb,
given a coating of hatred,
can be put on the flame of emotions again and again.
Into the milk of unemployment
the opium of religion is mixed,
and with the rice of frustrated hopes
the poisonous kheer of frenzy can be cooked again and again.
Because the one scalded by milk
soon forgets
to blow on the buttermilk before drinking it.

Friday, March 4

The Burden of Taxpayers' Money

Waste of taxpayers' money is not a new refrain. It comes up in contexts ranging from big cars for legislators to foreign jaunts on flimsy grounds for MPs, from memorial parks to IITians doing MBAs.

It is present with extraordinary force in the ongoing JNU row as well. Commentator after commentator has chided JNU students for wasting government money by engaging in politics. Having myself availed of a significant government subsidy for my undergraduate education, and also having been accused of engaging in politics on campus (poli-poli, as it was derisively called), this made me think.

First, let's not confuse income taxpayers with taxpayers. While about 3% of Indian citizens pay income tax, everyone pays indirect taxes like VAT, service tax etc. So JNU students are also taxpayers. It is also worth keeping in mind that constitutional rights do not depend on how much tax you pay.

Waste of taxpayers' money can mean one of two things. It may mean that while the government is supporting the students so that they can get a good education, they are instead engaging in politics and not focusing on their studies. But how do the commentators know that someone is engaging in politics at the expense of their work or studies? Did they check the semester records of the students involved in the protest? Have they checked the research records and teaching records of the professors? Will they be satisfied if someone with good academic performance was leading these protests?

And how can we decide if someone is paying sufficient attention? By looking at their CPIs? By looking at the number of papers they publish? Turning the question around, is it a waste of taxpayers' money if some of the students who focus only on their studies don't do well?

Another view says that it is a waste of taxpayers' money since these students are anti-national. Since they are questioning the elected government, the judiciary and the constitution, why should we bear the burden of educating them? For starters, let me point out that we do spend taxpayers' money on educating all kinds of convicted criminals: murderers, thieves, rapists, even terrorists. Jails do not fund themselves.

But let's get back to the question of being anti-national. It is clear that there is no violent action or criminal conspiracy involved here. The police action was in response to the sloganeering at an event. So we are in the territory of thought and speech crimes. The question is, what makes you anti-national? Is questioning the government of the day sedition? You can answer that one yourself.

How about questioning the courts? Scholars like Arun Shourie have written entire books criticizing judges for pushing forward a progressive agenda. Wasn't Manu Sharma let go by one of the very same courts, and didn't the whole country come out on the streets to protest against that? For society to work smoothly, you need to follow court orders. It doesn't mean you have to agree with them.

How about the constitution? Let's just say that we have already hit a century of constitutional amendments and there seems to be no end in sight. The BJP itself is against Article 370 of the constitution, demanding its removal. A complete review of the Indian constitution has been in their election manifestos. And the last time the NDA was in government, they constituted a National Commission to Review the Working of the Constitution. So much for not questioning the constitution.

So the question becomes: is someone under an increased obligation to toe the state line if he/she receives a state subsidy for education? Do they surrender their right to criticize the state? Are they not to have political opinions, and certainly not to express them? Is the state funding of education a charity? Isn't it the state's responsibility to make sure affordable education is available to everyone? Can the state discriminate between someone who criticizes it and someone who supports it?

And to turn the question around, are the standards of patriotism different for those who can afford to pay for their education?

For me personally, one of the most cherished experiences from college life is of participating in the small scale democratic experiment that IITK was. It was nothing compared to big universities and the boundaries were strictly set but even within those limits, it was an eye opener and a great teacher. It helped me to see myself as a part of the society, to engage with it, to care more deeply about its issues rather than being just an individual on a mission to make the greatest fortune for myself while having a shallow institute loyalty.

That experience convinced me that political awareness and engagement are part and parcel of a good education. An education that makes us good citizens instead of just taxpayers.

Tuesday, November 24


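Write a long couplet and you will regret it;
listen, 'Abhaga', you won't be able to tweet it.

The wind is strong, the fire is about to flare up;
inside or outside, wherever you stay, you will burn.

In this season of harvests of sharp words,
you will find blood dripping from every tongue.

On the newly made scale of patriotism,
open your mouth and you too will be weighed.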

Friday, October 30


Blind faith was passed off as love;
what a magic trick was shown, my friend.

Only on seeing the moon did she swallow a morsel;
that is how she bargained for years of life.

That man stayed hungry alongside her yesterday;
that is how he uplifted all of womankind.

She must have been a good daughter-in-law, 'Abhaga';
she bore the weight of customs in silence.

Wednesday, September 16

Reading Translations

Some time back on Twitter, in the course of a discussion on translations, someone mentioned that translations of foreign authors seem to find more favor with Indian readers than translations of Indian authors from Indian languages. While many love writers like Orhan Pamuk (Turkish) and Gabriel Garcia Marquez (Spanish), the same favor is not accorded to Indian writers. There is a certain disinterest towards them. As is the wont of every Twitter discussion, someone soon mentioned "colonized minds" and that was that.

I'm not sure about the validity of the original assertion; there are too many unverified assumptions here. How popular is Orhan Pamuk in India after all? But this discussion got me thinking about my own experience of reading translations. Do I have such a preference? I think I do.

Despite the multitude of languages, there is an underlying cultural continuity across India. Irrespective of the language of the story, if it is set in India, if it is about Indian society, you are likely to encounter familiar things, familiar situations in it. Reading about those familiar things in English often gets jarring. But when it comes to Turkey or Spain or Russia, I have little idea about the cultural idioms. So I can read peacefully in English without feeling awkward.

This problem is not limited to translations. I have run into the same problem with Indian English writing. Anita Desai's "In Custody" was ruined for me by the torturous passages upon passages trying to describe the beauty of Urdu poetry, and by the few translated couplets. I felt like shaking the author and asking her to just tell me the original Urdu couplet, which I am fully capable of enjoying directly. I wish she had included those at the end of the book or in footnotes.

For Indian books, my preference is to read the original if I can, then a Hindi translation, and an English translation only if there is no other option. The sad fact, though, is that even within India, the quality and availability of English translations is often better than that of Hindi translations.

Tuesday, July 29

Travails of a Lazy Kannada Learner

"Old Kannada inscription dated 981 CE in Vindyagiri hill at Shravanabelagola"

Based on my efforts to learn Kannada, I have come to realize that there is a disconnect between the resources available and resources required for learning the language. While there is a steady stream of apps to help Kannada learners, they all seem to be based on the same basic pattern which I believe is not useful.

When learning Kannada, a speaker of another Indian language is presented with two difficulties. One is the language and the other is the script. In the absence of reading, the sole source of language data being absorbed is speech. With spoken data, the learner can neither control the speed of the incoming data nor reliably repeat it. This severely slows down the vocabulary acquisition process.

Without knowing the script, he cannot read the newspaper, which is a good, cheap source of plenty of reading material. Reading the newspaper in a language also provides a good sense of the environment in which the language operates, allowing one to connect with the pulse of the city. This is a very important part of taking the language beyond a mere transactional medium used with a handful of people.

But without knowing the language, learning to read is also difficult. One can always start by mugging up the character shapes, ligatures and diacritic marks, but how do you know whether you read something correctly when you do not know the language? Also, in the beginning the process is so slow that it is easy to give up. One can try reading the boards on the road, but at vehicle speeds it is next to impossible (might work in certain areas of Bangalore like Hosur Road!). Even today, by the time I am able to read "Sarkari" on the school boards, the boards are well past. It seems silly, but it is a real problem!

So how do we break this cycle? From my personal experience, I can suggest a two pronged approach that helps start a positive reinforcing cycle.

For the problem of learning the script, I made progress by focusing on proper nouns. Since names do not change between languages, we can leverage our prior knowledge to make this work. Place names and boards on shops are obvious candidates, but as I have described above, shop boards are not suitable for sustained study in the beginning. I believe person names are a better source.

So I turned to Wikipedia. While Kannada Wikipedia is much smaller when compared to English, you can still find articles that are available in both. Out of these, the most useful are pages that list names in a particular category. I started with Presidents of USA, following it up with Chief Ministers of Karnataka and so on. In 1 week, I made more progress than I had ever made with all the books promising me to teach Kannada in 30 days.
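The manual step of finding the Kannada counterpart of an English Wikipedia page can probably be automated. Below is a minimal sketch, assuming the standard MediaWiki query API (`action=query` with `prop=langlinks` and the `lllang` filter) and its default JSON layout; the function names `langlinks_url` and `parallel_title` are made up for illustration, and the network fetch itself is left out.

```python
import json
from urllib.parse import urlencode

# English Wikipedia endpoint of the MediaWiki API.
API_URL = "https://en.wikipedia.org/w/api.php"


def langlinks_url(title, lang="kn"):
    """Build a query URL asking for the `lang` counterpart of an English page."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllang": lang,   # restrict the language links to one language
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def parallel_title(response_text, lang="kn"):
    """Extract the counterpart title from a JSON response, or None if absent.

    Assumes the default JSON format, where each language link is a dict
    with a "lang" key and the title stored under the "*" key.
    """
    data = json.loads(response_text)
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        for link in page.get("langlinks", []):
            if link.get("lang") == lang:
                return link.get("*")
    return None
```

The URL from `langlinks_url("List of presidents of the United States")` could be fetched with `urllib.request.urlopen`, and the response body passed to `parallel_title` to get the title of the Kannada page, if one exists.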

I have a theory about why this works. A list of names of Presidents of the United States of America is at a sweet spot between what you know and what you don't know. Some of the names are very common, so you can figure them out by reading only the first 1-2 characters. With others, like an uncommon middle name, you have to work till the end. So there is a good balance of challenge and reward.

Additionally, here the characters are repeated based on the usage frequency. So the most common characters and their common variants are reinforced and little effort is spent on characters not so common. Of course the frequency is a little off with English names and so I switched over to Kannada names afterwards. I now buy a Kannada newspaper once in a while and try reading it. To my surprise, I find that a lot of vocabulary is pretty close to Hindi/Sanskrit!
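To get a rough feel for this frequency effect, one can count how often each base letter shows up in a list of names. This is only a sketch: it counts Unicode code points of category "Lo" (which covers Kannada consonants and independent vowels) and ignores vowel signs, so it does not count full syllables. The function name and the two sample names are just illustrative.

```python
from collections import Counter
import unicodedata


def letter_frequencies(names):
    """Count base letters across names, skipping vowel signs, virama and spaces."""
    counts = Counter()
    for name in names:
        for ch in name:
            # Kannada base letters fall in Unicode category "Lo";
            # dependent vowel signs and the virama are "Mc"/"Mn".
            if unicodedata.category(ch) == "Lo":
                counts[ch] += 1
    return counts


# Illustrative input: two names written in Kannada script.
names = ["ಸಿದ್ದರಾಮಯ್ಯ", "ಕುಮಾರಸ್ವಾಮಿ"]
for letter, count in letter_frequencies(names).most_common(5):
    print(letter, count)
```

Run over a longer list of names, the letters at the top of this ranking are the ones that get reinforced the most.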

On to the language part. Most of the apps, classes and other resources available focus on teaching spoken Kannada. As far as I know, that is also the recommended way to approach a language. I tried books like Learn Kannada in 30 days, Rapidex English to Kannada, even an old book written by Rev. F. Ziegler meant for British officers. But none of them worked. I also tried a bunch of Kannada learning apps. They too proved more or less ineffective for long-term sustained learning. Over time, their interfaces are getting more polished, but the underlying content and ideas are not evolving.

There were some common problems. All these resources were based on collections of sentences and conversations for use in different circumstances. As is bound to happen with any made-up conversation, they were awkward and forced, and became useless very fast in real-world situations. None of them provided any instructions or pointers for further reading, additional vocabulary etc. Some of them had Kannada text printed in such bad quality that it was useless for a beginner.

One book that has helped me tremendously though is called Conversational Kannada. Although this book is also focused on spoken language, there are some crucial differences. The best thing is that instead of dividing the material as per situations, it divides and arranges it in the order of increasing grammatical and sentence construction complexity. For example, in the first 10 chapters, you work with sentences without any verbs. Each chapter also has a theme for the conversations as well as introduces the informal phrases that are used in that situation. All the Kannada text is in roman, so you can read it without knowing the script.

But there was still one problem when reading the above book. Beyond the sentences in the book, I had no reading material available in roman Kannada. While I was able to read the script by now, my speed was not fast enough to allow me to read fluently, which is required for vocabulary building. Also, I needed to control the difficulty of what I read based on my current vocabulary. For this, I turned to Twitter.

With the ability to see conversations, Twitter is a goldmine for a language learner. Plenty of people use roman Kannada on Twitter. You can search for specific words and find conversations where that word is being used. The best part is that the conversations are natural, there is plenty of English mixed in, allowing you to follow the conversations, and more reading material is being created all the time. People are helpful, so you can even ask some of them what a word means. I did this for many days. There are challenges around search: some words are common to other languages as well, some are spelled in multiple ways, and some others may not have been used on Twitter in the recent past. But overall, it was very helpful.

So, with the above two combined, I am confident that I will be able to read some simple books in Kannada by the end of the year. Both the ideas that helped me can easily be packaged as apps and provided to new learners. I did all the work manually: finding parallel Wikipedia pages in English and Kannada, searching the Twitter conversations. All this can be handled by a centralized service that collects and processes the data and then serves it to learners based on their requirements. Based on my experience, it will prove to be a lot more effective than the cookie-cutter apps that are available now.

Since then I have discovered one more resource: Quilpad Switch. This tool allows you to view any webpage in the script of your choice. So now you can read any Kannada page in roman or in Devanagari script, which gives immediate access to tons of reading material. This may even reduce the need for the Wikipedia pages that I used. The requirement is now for a curator that can point out texts based on the current vocabulary of the learner. Another idea ripe for making into an app. :)
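As a rough illustration of what such a curator might do, here is a minimal sketch that ranks texts by the share of words the learner already knows and keeps those that are neither too hard nor too easy. Everything here is an assumption made for illustration: the function names, the whitespace tokenization (punctuation is not stripped), and the 70%-95% coverage thresholds.

```python
def coverage(text, known_words):
    """Fraction of whitespace-separated words in `text` that the learner knows."""
    words = text.lower().split()
    if not words:
        return 0.0
    known = sum(1 for w in words if w in known_words)
    return known / len(words)


def pick_texts(texts, known_words, low=0.7, high=0.95):
    """Return texts whose coverage lies in [low, high], most familiar first.

    The thresholds are arbitrary placeholders: mostly familiar words,
    with a few new ones left to learn.
    """
    scored = [(coverage(t, known_words), t) for t in texts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for c, t in scored if low <= c <= high]
```

A learner's known-word set could be built up from the roman Kannada sentences already worked through, and the candidate texts could come from pages viewed in roman script.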

Image by en:User:Dineshkannambadi Original uploader was Dineshkannambadi at en.wikipedia - Transferred from en.wikipedia; transferred to Commons by User:Papa November using CommonsHelper.(Original text : Uploaded by creator). Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons -