## December 31, 2007

### SRILM ngram smoothing notes

This is a Unix man page I wrote explaining the smoothing methods used in the SRILM statistical language modeling toolkit.

# ngram-discount

## NAME

ngram-discount - notes on the N-gram smoothing implementations in SRILM

## NOTATION

* `a_z`: An N-gram where a is the first word, z is the last word, and "_" represents 0 or more words in between.
* `p(a_z)`: The estimated conditional probability of the nth word z given the first n-1 words (a_) of an N-gram.
* `a_`: The (n-1)-word prefix of the N-gram a_z.
* `_z`: The (n-1)-word suffix of the N-gram a_z.
* `c(a_z)`: The count of the N-gram a_z in the training data.
* `n(*_z)`: The number of unique N-grams that match a given pattern, where "(*)" represents a wildcard matching a single word.
* `n1, n[1]`: The number of unique N-grams with count = 1.

## DESCRIPTION

N-gram models try to estimate the probability of a word z in the context of the previous n-1 words (a_), i.e., Pr(z|a_). We will denote this conditional probability using p(a_z) for convenience. One way to estimate p(a_z) is to look at the number of times word z has followed the previous n-1 words (a_):

    (1) p(a_z) = c(a_z) / c(a_)

This is known as the maximum likelihood (ML) estimate. Unfortunately it does not work very well because it assigns zero probability to N-grams that have not been observed in the training data. To avoid the zero probabilities, we take some probability mass from the observed N-grams and distribute it to unobserved N-grams. Such redistribution is known as smoothing or discounting.
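As a toy illustration of Equation 1 (a minimal sketch, not SRILM's implementation; the counts are invented):

```python
from collections import Counter

def ml_estimate(ngram_counts, prefix_counts, ngram):
    """Maximum likelihood estimate: p(a_z) = c(a_z) / c(a_)."""
    prefix = ngram[:-1]
    if prefix_counts[prefix] == 0:
        return 0.0
    return ngram_counts[ngram] / prefix_counts[prefix]

# Toy bigram counts: c("the cat") = 2, c("the dog") = 1, so c("the") = 3.
ngram_counts = Counter({("the", "cat"): 2, ("the", "dog"): 1})
prefix_counts = Counter({("the",): 3})

print(ml_estimate(ngram_counts, prefix_counts, ("the", "cat")))   # 2/3
print(ml_estimate(ngram_counts, prefix_counts, ("the", "fish")))  # 0.0, the problem smoothing fixes
```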

Most existing smoothing algorithms can be described by the following equation:

    (2) p(a_z) = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)

If the N-gram a_z has been observed in the training data, we use the distribution f(a_z). Typically f(a_z) is discounted to be less than the ML estimate, so we have some probability mass left over for the z words unseen in the context (a_). Different algorithms differ mainly in how they discount the ML estimate to get f(a_z).

If the N-gram a_z has not been observed in the training data, we use the lower order distribution p(_z). If the context has never been observed (c(a_) = 0), we can use the lower order distribution directly (bow(a_) = 1). Otherwise we need to compute a backoff weight (bow) to make sure probabilities are normalized:

    Sum_Z p(a_z) = 1

Let Z be the set of all words in the vocabulary, Z0 be the set of all words with c(a_z) = 0, and Z1 be the set of all words with c(a_z) > 0. Given f(a_z), bow(a_) can be determined as follows:

    (3) Sum_Z  p(a_z) = 1
        Sum_Z1 f(a_z) + Sum_Z0 bow(a_) p(_z) = 1
        bow(a_) = (1 - Sum_Z1 f(a_z)) / Sum_Z0 p(_z)
                = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 p(_z))
                = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 f(_z))
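The normalization argument of Equation 3 can be checked numerically with a sketch like the following (the f values are hypothetical, and the toy f_lower is the complete lower order distribution, so the final form of Eq. 3 applies):

```python
def backoff_weight(f, f_lower):
    """bow(a_) per Eq. 3, with Z1 = keys of f (the words seen after a_)."""
    num = 1.0 - sum(f.values())               # 1 - Sum_Z1 f(a_z)
    den = 1.0 - sum(f_lower[z] for z in f)    # 1 - Sum_Z1 f(_z)
    return num / den

# Hypothetical discounted probabilities for one context (a_):
f = {"x": 0.4, "y": 0.2}                    # f(a_z) for the seen words z
f_lower = {"x": 0.3, "y": 0.1, "z": 0.6}    # lower order distribution, sums to 1

bow = backoff_weight(f, f_lower)

# Check normalization: Sum_Z p(a_z) under Eq. 2 is 1 (up to rounding).
total = sum(f.values()) + bow * sum(p for z, p in f_lower.items() if z not in f)
print(bow, total)
```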

Smoothing is generally done in one of two ways. The backoff models compute p(a_z) based on the N-gram counts c(a_z) when c(a_z) > 0, and only consider lower order counts c(_z) when c(a_z) = 0. Interpolated models take lower order counts into account when c(a_z) > 0 as well. A common way to express an interpolated model is:

    (4) p(a_z) = g(a_z) + bow(a_) p(_z)

where g(a_z) = 0 when c(a_z) = 0, and g(a_z) is discounted to be less than the ML estimate when c(a_z) > 0 in order to reserve some probability mass for the unseen z words. Given g(a_z), bow(a_) can be determined as follows:

    (5) Sum_Z  p(a_z) = 1
        Sum_Z1 g(a_z) + Sum_Z bow(a_) p(_z) = 1
        bow(a_) = 1 - Sum_Z1 g(a_z)

An interpolated model can also be expressed in the form of equation (2), which is the way it is represented in the ARPA format model files in SRILM:

    (6) f(a_z) = g(a_z) + bow(a_) p(_z)
        p(a_z) = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)
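The conversion in Equation 6 is mechanical; here is a sketch with invented numbers showing that the resulting backoff form still sums to one:

```python
def interpolated_to_backoff(g, bow, p_lower):
    """Eq. 6: f(a_z) = g(a_z) + bow(a_) p(_z), stored for seen N-grams only."""
    return {z: g[z] + bow * p_lower[z] for z in g}

# Hypothetical interpolated parameters for one context (a_):
g = {"x": 0.5, "y": 0.1}                    # g(a_z); g = 0 for unseen z
p_lower = {"x": 0.2, "y": 0.3, "z": 0.5}    # lower order p(_z), sums to 1
bow = 1.0 - sum(g.values())                 # Eq. 5

f = interpolated_to_backoff(g, bow, p_lower)

# Eq. 2 with these f and bow values sums to one over the vocabulary:
total = sum(f.values()) + bow * sum(p for z, p in p_lower.items() if z not in g)
print(f, total)
```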

Most algorithms in SRILM have both backoff and interpolated versions. Empirically, interpolated algorithms usually do better than the backoff ones, and Kneser-Ney does better than others.

## OPTIONS

This section describes the formulation of each discounting option in ngram-count(1). After giving the motivation for each discounting method, we will give expressions for f(a_z) and bow(a_) of Equation 2 in terms of the counts. Note that some counts may not be included in the model file because of the -gtmin options; see Warning 4 in the next section.

Backoff versions are the default but interpolated versions of most models are available using the -interpolate option. In this case we will express g(a_z) and bow(a_) of Equation 4 in terms of the counts as well. Note that the ARPA format model files store the interpolated models and the backoff models the same way using f(a_z) and bow(a_); see Warning 3 in the next section. The conversion between backoff and interpolated formulations is given in Equation 6.

The discounting options may be followed by a digit (1-9) to indicate that only specific N-gram orders be affected. See ngram-count(1) for more details.

-cdiscount D
Ney's absolute discounting using D as the constant to subtract. D should be between 0 and 1. If Z1 is the set of all words z with c(a_z) > 0:
    f(a_z)  = (c(a_z) - D) / c(a_)
    p(a_z)  = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)     ; Eq. 2
    bow(a_) = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 f(_z))  ; Eq. 3
With the -interpolate option we have:
    g(a_z)  = max(0, c(a_z) - D) / c(a_)
    p(a_z)  = g(a_z) + bow(a_) p(_z)                    ; Eq. 4
    bow(a_) = 1 - Sum_Z1 g(a_z)                         ; Eq. 5
            = D n(a_*) / c(a_)
The suggested discount factor is:
    D = n1 / (n1 + 2*n2)
where n1 and n2 are the total number of N-grams with exactly one and two counts, respectively. Different discounting constants can be specified for different N-gram orders using options -cdiscount1, -cdiscount2, etc.
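A sketch of absolute discounting with the suggested discount (toy counts; not SRILM's code):

```python
from collections import Counter

# Toy bigram counts in the context ("the",), so c(a_) = 4:
counts = Counter({("the", "cat"): 2, ("the", "dog"): 1, ("the", "end"): 1})
c_context = sum(counts.values())

# Suggested discount D = n1 / (n1 + 2*n2): here n1 = 2 and n2 = 1, so D = 0.5.
n1 = sum(1 for c in counts.values() if c == 1)
n2 = sum(1 for c in counts.values() if c == 2)
D = n1 / (n1 + 2 * n2)

# f(a_z) = (c(a_z) - D) / c(a_) for the observed bigrams:
f = {ng: (c - D) / c_context for ng, c in counts.items()}
# Held-out mass D n(a_*) / c(a_) is redistributed to unseen words via bow(a_):
held_out = D * len(counts) / c_context
print(D, f, held_out)
```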
-kndiscount and -ukndiscount
Kneser-Ney discounting. This is similar to absolute discounting in that the discounted probability is computed by subtracting a constant D from the N-gram count. The options -kndiscount and -ukndiscount differ as to how this constant is computed.
The main idea of Kneser-Ney is to use a modified probability estimate for lower order N-grams used for backoff. Specifically, the modified probability for a lower order N-gram is taken to be proportional to the number of unique words that precede it in the training data. With discounting and normalization we get:
    f(a_z) = (c(a_z) - D0) / c(a_)   ;; for highest order
    f(_z)  = (n(*_z) - D1) / n(*_*)  ;; for lower orders
where the n(*_z) notation represents the number of unique N-grams that match a given pattern with (*) used as a wildcard for a single word. D0 and D1 represent two different discounting constants, as each N-gram order uses a different discounting constant. The resulting conditional probability and the backoff weight are calculated as given in equations (2) and (3):
    p(a_z)  = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)     ; Eq. 2
    bow(a_) = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 f(_z))  ; Eq. 3
The option -interpolate is used to create the interpolated versions of -kndiscount and -ukndiscount. In this case we have:
    p(a_z) = g(a_z) + bow(a_) p(_z)   ; Eq. 4
Let Z1 be the set {z: c(a_z) > 0}. For highest order N-grams we have:
    g(a_z)  = max(0, c(a_z) - D) / c(a_)
    bow(a_) = 1 - Sum_Z1 g(a_z)
            = 1 - Sum_Z1 c(a_z) / c(a_) + Sum_Z1 D / c(a_)
            = D n(a_*) / c(a_)
Let Z2 be the set {z: n(*_z) > 0}. For lower order N-grams we have:
    g(_z)  = max(0, n(*_z) - D) / n(*_*)
    bow(_) = 1 - Sum_Z2 g(_z)
           = 1 - Sum_Z2 n(*_z) / n(*_*) + Sum_Z2 D / n(*_*)
           = D n(_*) / n(*_*)
The original Kneser-Ney discounting (-ukndiscount) uses one discounting constant for each N-gram order. These constants are estimated as
    D = n1 / (n1 + 2*n2)
where n1 and n2 are the total number of N-grams with exactly one and two counts, respectively.
Chen and Goodman's modified Kneser-Ney discounting (-kndiscount) uses three discounting constants for each N-gram order, one for one-count N-grams, one for two-count N-grams, and one for three-plus-count N-grams:
    Y   = n1 / (n1 + 2*n2)
    D1  = 1 - 2Y (n2/n1)
    D2  = 2 - 3Y (n3/n2)
    D3+ = 3 - 4Y (n4/n3)
Warning:
SRILM implements Kneser-Ney discounting by actually modifying the counts of the lower order N-grams. Thus, when the -write option is used to write the counts with -kndiscount or -ukndiscount, only the highest order N-grams and N-grams that start with <s> will have their regular counts c(a_z); all others will have the modified counts n(*_z) instead. See Warning 2 in the next section.
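The discounting constants above are simple functions of the count-of-count statistics. A sketch (the n1..n4 values are invented for illustration):

```python
def original_kn_discount(n1, n2):
    """The single constant per order used by -ukndiscount."""
    return n1 / (n1 + 2 * n2)

def modified_kn_discounts(n1, n2, n3, n4):
    """Chen & Goodman's three discounts per order used by -kndiscount."""
    Y = n1 / (n1 + 2 * n2)
    D1 = 1 - 2 * Y * (n2 / n1)
    D2 = 2 - 3 * Y * (n3 / n2)
    D3plus = 3 - 4 * Y * (n4 / n3)
    return D1, D2, D3plus

# Hypothetical count-of-count statistics for one N-gram order:
print(original_kn_discount(1000, 400))             # 1000/1800
print(modified_kn_discounts(1000, 400, 200, 100))
```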
-wbdiscount
Witten-Bell discounting. The intuition is that the weight given to the lower order model should be proportional to the probability of observing an unseen word in the current context (a_). Witten-Bell computes this weight as:
    bow(a_) = n(a_*) / (n(a_*) + c(a_))
Here n(a_*) represents the number of unique words following the context (a_) in the training data. Witten-Bell is originally an interpolated discounting method. So with the -interpolate option we get:
    g(a_z) = c(a_z) / (n(a_*) + c(a_))
    p(a_z) = g(a_z) + bow(a_) p(_z)   ; Eq. 4
Without the -interpolate option we have the backoff version, which is implemented by taking f(a_z) to be the same as the interpolated g(a_z):

    f(a_z)  = c(a_z) / (n(a_*) + c(a_))
    p(a_z)  = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)     ; Eq. 2
    bow(a_) = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 f(_z))  ; Eq. 3
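A sketch of the interpolated Witten-Bell computation for a single context (toy counts, not SRILM's code):

```python
from collections import Counter

def witten_bell(counts):
    """g(a_z) and bow(a_) for one context, given counts of the words after it."""
    n_types = len(counts)             # n(a_*): unique words following the context
    c_total = sum(counts.values())    # c(a_)
    g = {z: c / (n_types + c_total) for z, c in counts.items()}
    bow = n_types / (n_types + c_total)
    return g, bow

# Toy counts of words following the context ("the",):
g, bow = witten_bell(Counter({"cat": 2, "dog": 1}))
print(g, bow)  # bow = 2 / (2 + 3) = 0.4
# Since the lower order p(_z) sums to 1 over the vocabulary,
# Sum g(a_z) + bow(a_) * 1 = 0.6 + 0.4 = 1, as Eq. 4 requires.
```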
-ndiscount
Ristad's natural discounting law. See Ristad's technical report "A natural law of succession" for a justification of the discounting factor. The -interpolate option has no effect; only a backoff version has been implemented.
             c(a_z)  c(a_) (c(a_) + 1) + n(a_*) (1 - n(a_*))
    f(a_z) = ------  ---------------------------------------
             c(a_)        c(a_)^2 + c(a_) + 2 n(a_*)

    p(a_z)  = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)     ; Eq. 2
    bow(a_) = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 f(_z))  ; Eq. 3
-count-lm
Estimate a count-based interpolated LM using Jelinek-Mercer smoothing (Chen & Goodman, 1998), also known as "deleted interpolation." Note that this does not produce a backoff model; instead, a count-LM parameter file in the format described in ngram(1) needs to be specified using -init-lm, and a reestimated file in the same format is produced. In the process, the mixture weights that interpolate the ML estimates at all levels of N-grams are estimated using an expectation-maximization (EM) algorithm. The options -em-iters and -em-delta control termination of the EM algorithm. Note that the N-gram counts used to compute the maximum-likelihood estimates are specified in the -init-lm model file. The counts specified with -read or -text are used only to estimate the interpolation weights.
-addsmooth D
Smooth by adding D to each N-gram count. This is usually a poor smoothing method, included mainly for instructional purposes.

    p(a_z) = (c(a_z) + D) / (c(a_) + D n(*))
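A sketch of this additive formula (toy numbers; D = 1 is Laplace smoothing):

```python
def add_d(c_ngram, c_context, D, vocab_size):
    """p(a_z) = (c(a_z) + D) / (c(a_) + D * n(*))."""
    return (c_ngram + D) / (c_context + D * vocab_size)

# Vocabulary of 3 words with context counts 2, 1, 1 (so c(a_) = 4), D = 1:
probs = [add_d(c, 4, 1, 3) for c in (2, 1, 1)]
print(probs)       # [3/7, 2/7, 2/7]
print(sum(probs))  # sums to 1: already normalized, no backoff needed
```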
default
If the user does not specify any discounting options, ngram-count uses Good-Turing discounting (aka Katz smoothing) by default. The Good-Turing estimate states that for any N-gram that occurs r times, we should pretend that it occurs r' times where
    r' = (r+1) n[r+1] / n[r]
Here n[r] is the number of N-grams that occur exactly r times in the training data.
Large counts are taken to be reliable, thus they are not subject to any discounting. By default unigram counts larger than 1 and other N-gram counts larger than 7 are taken to be reliable and maximum likelihood estimates are used. These limits can be modified using the -gtnmax options.
    f(a_z) = c(a_z) / c(a_)   if c(a_z) > gtmax
The lower counts are discounted proportional to the Good-Turing estimate with a small correction A to account for the high-count N-grams not being discounted. If 1 <= c(a_z) <= gtmax:
                     n[gtmax + 1]
    A = (gtmax + 1) --------------
                         n[1]

                           n[c(a_z) + 1]
    c'(a_z) = (c(a_z) + 1) -------------
                             n[c(a_z)]

              c(a_z)   (c'(a_z) / c(a_z) - A)
    f(a_z) = --------  ----------------------
              c(a_)           (1 - A)
The -interpolate option has no effect in this case; only a backoff version has been implemented. Thus:

    p(a_z)  = (c(a_z) > 0) ? f(a_z) : bow(a_) p(_z)     ; Eq. 2
    bow(a_) = (1 - Sum_Z1 f(a_z)) / (1 - Sum_Z1 f(_z))  ; Eq. 3
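A sketch of the Good-Turing discount (the count-of-count table n[r] is invented; real tables come from the training data):

```python
def good_turing_f(c, c_context, n, gtmax=7):
    """Discounted f(a_z) = c(a_z)/c(a_) * (c'/c - A) / (1 - A) for c <= gtmax."""
    if c > gtmax:
        return c / c_context              # large counts are taken to be reliable
    A = (gtmax + 1) * n[gtmax + 1] / n[1]
    c_prime = (c + 1) * n[c + 1] / n[c]
    return (c / c_context) * (c_prime / c - A) / (1 - A)

# Hypothetical count-of-count table n[r]:
n = {1: 1000, 2: 400, 3: 200, 4: 120, 5: 80, 6: 60, 7: 40, 8: 30}

print(good_turing_f(1, 100, n))   # discounted below the ML estimate 0.01
print(good_turing_f(9, 100, n))   # above gtmax: the plain ML estimate 9/100
```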

## FILE FORMATS

SRILM can generate simple N-gram counts from plain text files with the following command:
    ngram-count -order N -text file.txt -write file.cnt
The -order option determines the maximum length of the N-grams. The file file.txt should contain one sentence per line with tokens separated by whitespace. The output file.cnt contains the N-gram tokens followed by a tab and a count on each line:
    a_z <tab> c(a_z)
A couple of warnings:
Warning 1
SRILM implicitly assumes an <s> token at the beginning of each line and an </s> token at the end of each line, and counts N-grams that start with <s> and end with </s>. You do not need to include these tags in file.txt.
Warning 2
When -kndiscount or -ukndiscount options are used, the count file contains modified counts. Specifically, all N-grams of the maximum order, and all N-grams that start with <s> have their regular counts c(a_z), but shorter N-grams that do not start with <s> have the number of unique words preceding them n(*a_z) instead. See the description of -kndiscount and -ukndiscount for details.
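A sketch of the counting step, including the implicit sentence boundary tokens (this mimics what ngram-count does; it is not the tool itself):

```python
from collections import Counter

def count_ngrams(sentences, order):
    """Count all N-grams up to the given order, adding <s> and </s> implicitly."""
    counts = Counter()
    for line in sentences:
        words = ["<s>"] + line.split() + ["</s>"]
        for n in range(1, order + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts

counts = count_ngrams(["a b", "a c"], order=2)
print(counts[("<s>", "a")])  # 2: both sentences start with "a"
print(counts[("a", "b")])    # 1
```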

For most smoothing methods (except -count-lm) SRILM generates and uses N-gram model files in the ARPA format. A typical command to generate a model file would be:

    ngram-count -order N -text file.txt -lm file.lm
The ARPA format output file.lm will contain the following information about an N-gram on each line:
    log10(f(a_z)) <tab> a_z <tab> log10(bow(a_z))
Based on Equation 2, the first entry represents the base 10 logarithm of the conditional probability (logprob) for the N-gram a_z. This is followed by the actual words in the N-gram separated by spaces. The last and optional entry is the base-10 logarithm of the backoff weight for (n+1)-grams starting with a_z.
Warning 3
Both backoff and interpolated models are represented in the same format. This means interpolation is done during model building and represented in the ARPA format with logprob and backoff weight using equation (6).
Warning 4
Not all N-grams in the count file necessarily end up in the model file. The options -gtmin, -gt1min, ..., -gt9min specify the minimum counts for N-grams to be included in the LM (not only for Good-Turing discounting but for the other methods as well). By default all unigrams and bigrams are included, but for higher order N-grams only those with count >= 2 are included. Some exceptions arise, because if one N-gram is included in the model file, all its prefix N-grams have to be included as well. This causes some higher order 1-count N-grams to be included when using KN discounting, which uses modified counts as described in Warning 2.
Warning 5
Not all N-grams in the model file have backoff weights. The highest order N-grams do not need a backoff weight. For lower order N-grams backoff weights are only recorded for those that appear as the prefix of a longer N-gram included in the model. For other lower order N-grams the backoff weight is implicitly 1 (or 0, in log representation).
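A sketch of how a backoff model in this format would be queried (Eq. 2); the table entries below are invented log10 values, and a missing backoff weight is treated as 0 per Warning 5:

```python
def logprob(model, ngram):
    """Look up log10 p(a_z) in an ARPA-style table mapping
    N-gram tuples to (log10 prob, log10 backoff weight)."""
    if ngram in model:
        return model[ngram][0]
    if len(ngram) == 1:
        return float("-inf")              # out-of-vocabulary word
    prefix = ngram[:-1]
    bow = model[prefix][1] if prefix in model else 0.0
    return bow + logprob(model, ngram[1:])

# Hypothetical model entries:
model = {
    ("a",):     (-0.5, -0.3),
    ("b",):     (-0.7,  0.0),
    ("a", "b"): (-0.2,  0.0),
}
print(logprob(model, ("a", "b")))  # stored directly: -0.2
print(logprob(model, ("a", "a")))  # backoff: bow(("a",)) + logprob(("a",))
```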

## SEE ALSO

ngram(1), ngram-count(1), ngram-format(5).

S. F. Chen and J. Goodman, "An Empirical Study of Smoothing Techniques for Language Modeling," TR-10-98, Computer Science Group, Harvard University, 1998.

## BUGS

Work in progress.

## AUTHOR

Deniz Yuret <dyuret@ku.edu.tr>
Andreas Stolcke <stolcke@speech.sri.com>

## December 16, 2007

### Productive Thinking Habits

Summary: To make good decisions in the face of an uncertain future, we
must be able to draw the right conclusions from the information at
hand. For this we need to discover the important problems early on,
prioritize them, and generate creative solutions. To choose among
these solutions we must be able to test their validity by seeing the
assumptions behind them and the consequences they lead to, and to
strike a balance among the conflicting values that different solutions
support. I believe that when planning an educational program on
correct thinking, it is important to set goals and to understand
exactly what we want. Disciplines such as logic, science, and law have
each developed their own characteristic methods of thought for
evaluating ideas. My aim in this report is to illustrate these methods
that the different disciplines give us, and to use these examples to
establish evaluation criteria for the educational programs that could
be developed from them. In the appendix at the end of the report, I
will summarize these habits of thought.

## November 03, 2007

### DARPA Urban Challenge

I wanted to mention a competition I have been following with
excitement for a while. DARPA has long wanted to develop cars that can
drive themselves without a driver: http://www.darpa.mil/grandchallenge

To this end they organized two races, in 2004 and 2005. In the first
race no vehicle managed to finish the 150-mile desert course in one
piece. In the second race, four or five cars completed the course
within the allotted time. The talk by Sebastian Thrun of the winning
Stanford team is worth watching.

Seeing the progress Stanford and CMU made within a single year
convinced me that by the 2020s we will no longer be driving ourselves
to work.

This year the third race is being held. This time the participants
will race not in the desert but in city traffic. MIT is taking part
for the first time. The race is today at 8am (California time). For
footage from the qualifying rounds I recommend the site below:

http://www.tgdaily.com/content/view/34686/113/

## October 21, 2007

### King Kong and Physics

Small-scale models are frequently used for visual effects in Hollywood
films. For example, instead of destroying a real house to film an
explosion, a small model is blown up; when the film is slowed down, it
gives the right impression.

However, this trick does not work in scenes that involve water. This
is one reason Waterworld became one of the biggest money-losing films
of all time: most of its scenes had to be shot at full scale on
water. Put some small boats in a bathtub and stir the water a
bit. Film it, then slow it down. You will notice that the waves and
motions do not look very realistic. The transformations Emrah
mentioned work 100% for systems made of point masses governed only by
gravity, but the fluidity of water spoils things.

Ever since I learned of this example, cases where such symmetries are
preserved or broken have fascinated me. I recently discovered two
excellent essays on the sizes of living things:

1. An essay by JBS Haldane, one of my favorite scientists, on the physical constraints imposed by animals' sizes.

2. An application of these physical consequences to the giants and dwarfs of cheap Hollywood sci-fi films.

In short, the key concept when thinking about living things is this:
when you double a creature's height proportionally, its surface areas
quadruple and its volume and mass increase eightfold. This has
interesting consequences. For example, large animals fall faster when
dropped from high places (are we back to Aristotle? :) The reason is
that the force exerted by gravity is proportional to mass, while air
resistance is proportional to area. So while a human jumping from a
height dies, nothing happens to a mouse (its terminal velocity is
around 10 km/h). A horse, on the other hand, turns into soup; during
sieges the Byzantines used to catapult diseased horses over castle
walls to spread epidemics. What keeps elephants captive in zoos is not
the flimsy fences around them (those are to keep people out) but the
1-meter ditch that has been dug (the elephant knows it would not
survive falling in).
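The square-cube scaling described above is easy to tabulate (a toy sketch; the numbers are illustrative only):

```python
def scale(factor):
    """Scale a creature's linear size: area goes as the square,
    volume and mass as the cube (the square-cube law)."""
    return {"length": factor, "area": factor ** 2, "mass": factor ** 3}

s = scale(2)
print(s)                      # doubling height: 4x surface area, 8x mass
print(s["area"] / s["mass"])  # 0.5: relative strength halves with each doubling
```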

I can observe other examples in my daughter Asya (20 months old). She
gets cold easily (energy production is proportional to mass, heat loss
to skin surface), so she eats a lot (proportionally, it is like me
drinking 15-20 liters of milk a day), and she is very strong for her
weight (muscle and bone strength scale with cross-sectional area,
weight with volume).

Using these simple relations we can explain why King Kong is
impossible (all his bones would break the first time he jumped), why
Brontosaurus had to keep its head close to the ground (if it raised
it, its leg vessels would burst under the fluid pressure and it would
faint because no blood would reach its brain), why giant insects do
not exist (their lungs would not penetrate deep enough), and why
people cannot be shrunk to microscopic size with magic rays (Brownian
motion, the wavelength of light, energy production, etc.).

What do you say, shall I put these on the physics 101 final for our
students? ;)

### On symmetry

Some thoughts prompted by an essay of Emrah's on symmetry:

1. On explaining the conservation of energy, momentum, and angular
momentum through symmetries of time and space: I keep hearing this
from physicist friends but have never managed to see the
derivation. Is there a simple derivation of it that all of us could
understand? If so, why don't they teach it in high school?

2. I would like to note that the concept of symmetry, a subject of
study in its own right in mathematics and physics, is used there not
in its everyday sense but in a generalized one, namely "the things we
care about remaining the same after some change is made." But I still
do not fully understand what some terms like "symmetry breaking"
really mean.

3. We say that the success of science rests on local experiments
giving us valid information about very distant and very different
settings. Can we look here for the reason psychology and sociology
have been unsuccessful? For example, the conditions we observe for one
person, one country, or one moment in history generally do not produce
the same result for another person, country, or time (though this may
be because we do not specify the "conditions" in enough detail). So
there seems to be no simple symmetry of person, place, and time, and
this deprives science of one of its most powerful tools. The
difficulty of doing "experiments" in fields like history and economics
is often mentioned, but perhaps the real problem is that even if we
could do the experiments, they would not give similar results. Unless
we look at these phenomena at levels where certain symmetries hold,
the situation looks hopeless.

4. In my earlier writings on what can and cannot be explained
scientifically, I referred to algorithmic complexity. A sequence (of
events, numbers, bits) is either explainable to some degree
(compressible, predictable) or completely random (that is how the
theory defines randomness). But I will leave this potentially long
discussion for later.

### A good example of a thought experiment

I am reminded of a thought-experiment story I heard from my physics
professor at university. It is an example of how effective thought
experiments can be when they are thought through well:

In Aristotle's physics, objects moving in a straight line come to a
stop after a while unless they are pushed. When two objects are
dropped, the heavier one lands first. These results seem consistent
with everyday experience. Without a careful experiment, these are the
answers even a child would give you from simple thought
experiments. Today we blame Aristotle: why didn't the man climb a
tower and try it? But is the thought experiment really at fault?

A carefully designed thought experiment can show us the truth.
Suppose we drop a brick from the top of the tower of Pisa and say it
takes 10 seconds to reach the ground. Aristotle would surely have had
no doubt that a second brick, identical to the first, would also fall
in 10 seconds. What if we drop the two bricks at the same time? Since
we do not think they affect each other, they will again hit the ground
together after ten seconds. Now comes the striking point: put a drop
of superglue between the two identical bricks and turn them into a
single mass twice as large. When we drop this new mass, how many
seconds do we expect it to fall? Is this really any different from the
previous experiment? In one case two bricks fall side by side; in the
other they fall side by side stuck together. I think if Aristotle had
considered this experiment back then, he could have seen the right
answer 2000 years before Galileo.

### Science foolishness

On the subject of science foolishness, I recommend Adam Curtis's
documentary Pandora's Box, which I watched recently.

In this six-part documentary the filmmaker examines a wide range of
disappointments science has brought us, from communist Russia's dream
of building a rational and mechanical society to the scandals of
atomic energy.

In all six stories he tells, the scenario I can see is similar:
technologies that might need decades or centuries to be developed and
matured are oversold by scientists for personal reasons. Politicians
trust the prototypes they are shown and make promises to the public
that they cannot keep. As a result, immature technologies go into
production under great time pressure. Lies are told about the risks,
and honest scientists who try to tell the truth are silenced. When the
expected disaster finally happens, people blame science.

I think the solution to the problem this scenario reveals should be
sought not in science but in sociology and politics. After all,
"science" is in the end just a word derived from the root "to know,"
and knowing harms no one (if you believe in Popper's open society and
Mill's freedom of thought). All the disasters come from pretending to
know about things we do not know, or from hiding some of the things we
do know.

As for Nuray Mert's column on science, I do not know where to
begin. Memduh seems to have sent it purely to make my hair stand on
end. As if we were a country steeped in science... Below are a couple
of its conceptual confusions and factual errors:

1. "the subjects where science runs out of breath": I am not sure
whether this means things that today's science cannot explain, or
things that cannot be explained in principle. In any case, my
impression from the column as a whole is that it confuses the
scientific method with the current body of scientific knowledge. It is
already part of the definition of science that what we know today may
be proven wrong at any moment. But this is the best we can do; if
anyone knows a better way, the floor is theirs.

2. "this effort has so far been no use in solving the fundamental
questions of existence" is an interesting observation. After all,
modern humans have been walking this earth for 150,000 years. As the
result of a three-hundred-year effort we are producing ideas about the
farthest reaches of the universe and the smallest building blocks of
matter, and we are approaching the point of giving life to lifeless
molecules. I wonder which "fundamental questions" we would have to
answer to satisfy Ms. Mert?

3. "Enlightenment philosophy claimed, and hoped, that a worldview
based on scientific thought would also solve social and political
problems. It did not turn out that way." Unless I have misread the
subject of the sentence, this is a problem of "Enlightenment
philosophy," not of science :) Of course, two paragraphs later our
columnist thoroughly confuses me by defending philosophy against
science: "these questions still need to be weighed at the
philosophical level."

The application of science to society is taken up, with the example of
Russia, in the first episode of the documentary mentioned above.
Science is currently very far from successfully explaining either
human psychology or the behavior of society. Distorting this fact and
pretending to know something about subjects we do not know is the
result of human weakness and irrationality, not of too much
rationality.

4. "People gouged each other's eyes out in the Middle Ages, and they
still do today." I recommend Pinker's TED talk on the historical
decline of violence. What Ms. Mert says here is completely
misleading. Even considering the world wars of the 20th century, if
the death rate had been the same as in the tribal wars of the old
world, we would have expected 2 billion deaths rather than 100
million. Until a few centuries ago it was the norm for half the men
you knew to die at another man's hand not long past the age of 30.

5. "Nobody says scientific knowledge should not be taught in schools,
but don't we have the right to say it should not be put in the place
of absolute truth?" Is there anyone who claims that scientific
knowledge is absolute truth? I will not take up my readers' time
explaining that this contradicts the very definition of science.

Perhaps what bothers me most is that the author criticizes what has
happened one-sidedly without offering an alternative. She does not say
"we should have done this instead of that," makes no recommendation
for the future, and offers no other successful examples of ways of
learning about life. By presenting problems that stem from the nature
of people and institutions as the fault of science, she gives a green
light to every kind of alternative crackpot philosophy. Exactly the
perspective we need these days (!).

### Why-ists and How-ists

There is a question about the why-ist/how-ist distinction that has
been on my mind for a while. In Memduh's words:

> But probably at the end of all these inquiries the answer will either
> (i) be tied to a physical cause, or
> (ii) come about through the mind and will (of the individual, of evolution, of God)

By a physical cause here, I am sure what is meant is not an
explanation limited to our current scientific knowledge. For example,
today we think there are four distinct forces in the universe and try
to explain all our observations as consequences of these four
forces. But if tomorrow an experiment were performed that these four
forces could not explain, that would not be the end of science; either
a fifth force would be added, or an entirely different conceptual
framework beyond forces would be found.

Taking science in this broad sense, let us first ask what it means for
a phenomenon (consciousness, for example) to be explained by the
scientific method: a successful scientific explanation offers us a
hypothesis, a theory, that is consistent with the observations made so
far and with which we can make testable predictions about observations
we may make in the future. So, for example, when we find a successful
psychological theory (no such theory exists yet), it could in
principle let us successfully predict what I will want, what I will
believe, and how I will behave under which conditions. Using this
theory we could build robots and simulations with similar properties,
and so on.

At this point my first question is: what exactly do the why-ists mean
by an explanation? For example, are statements like "God willed it, so
it happened" or "I wanted to, so I did it" explanations? If so, then
these two groups of people mean very different things by the concept
of "explanation." In that case this is a false dichotomy; the two
groups are not seeking answers to the same questions. The why-ists'
explanations carry no desire, and no potential, for making predictions
about the future. Indeed, if we could make such predictions (if, say,
we had a successful theory of what individuals, or God, will want
tomorrow), we would no longer differ from the how-ists.

One possibility is that the why-ists believe some things cannot in
principle be explained by the scientific method. But then what does it
mean for a phenomenon to be inexplicable by the scientific method in
principle? For example, when Penrose looks for new quantum properties
to explain consciousness, does he believe that today's science is
inadequate, or that science in general will never be able to explain
consciousness? One view of scientific explanation and prediction can
be summarized as predicting the subsequent terms of a sequence of
numbers observed so far. In that case, claiming that this prediction
cannot be made even in principle is equivalent to claiming that the
sequence is completely random. Do the why-ists then believe that God,
or human consciousness, arises at some level from pure randomness?

Another possibility is that the why-ists have no quarrel with
scientific explanations; such explanations simply do not interest
them, or do not satisfy them. Nothing to say to that, of course, but
then we would have to say that instead of a conceptually important
dichotomy we merely have personal preferences...

## September 26, 2007

### Düşünce deneyleri

Bu aralar bilinç üzerine ne varsa okuyorum. Tartışmaların çoğu
düşünce deneyleri üzerine dönüyor. Birkaç örnek vermek gerekirse:

* Eğer bir yarasa olsak kendimizi nasıl hissederdik?

* Star trek usulü bir scanner ile tüm atomlarınızı tarayıp sizi
başka bir yerde yeniden oluştursak hangi sız gerçek siz olurdunuz?

* Benimle aynı fiziksel konfigürasyona sahip olan (atomlar,
moleküller) ama içinde bir "ben" olmayan bir yaratık (zombie)
olabilir mi?

* Gelecekte beynin tüm sırlarını çözmüş olalım. Mary, bir renk
uzmanı, insanın renk algılarıyla ilgili bütün mekanizmaları en
ince ayrıntısına kadar biliyor olsun. Ama Mary ömrü boyunca siyah
beyaz bir odada oturup dünyayı siyah beyaz bir monitörden
izlesin. Sonunda bir gün dışarı çıkıp gerçek renkleri gördüğünde
yeni birşey öğrenir mi?

I posted on my blog a small experiment that got me thinking about
this subject:

http://denizyuret.blogspot.com/2007/09/hearing.html

The bad side of thought experiments is that, unlike physical
experiments, they have no strong tie binding them to a consistent
reality. (Why physical reality is consistent is a separate topic of
debate; if we did not live in this century, it would be quite
surprising that the mechanism that makes the apple fall to the
earth and the mechanism that keeps the moon turning around the
earth are explained by the same rules.) The function thought
experiments serve, then, is to act as intuition pumps, bringing to
the foreground certain intuitions that already exist in our heads
and letting us apply them to the problem at hand. Of course, if our
initial intuitions are wrong, the conclusions we reach turn out
wrong as well.

Of course, not all thought experiments are as misleading as the
ones above. Einstein-style physical "gedankenexperiments" can help
organize people's thinking through experiments that are physically
impossible to perform (at least at the time).

Still, one has to be careful. The mere fact that some philosophers
can imagine zombies does not prove that zombies could really exist,
or that consciousness has a non-physical component. Conversely, the
fact that some philosophers cannot imagine a "feeling robot" being
built is a problem of their imagination, not a proof of any fact.

Thought experiments that begin "One day, stranded on an island..."
have the same problem: they are bounded by the limits of our
imagination. With our present limited experience we may be unable
to imagine many of the things that could actually happen if we were
stranded on an island. The beauty of physical experiments is that
they do not depend on the experimenter's imagination, and they can
sometimes produce results it would never have occurred to us to
imagine.

Perhaps even more important is that the experiment Galileo
performed in Italy in 1650 gives the same result in Istanbul in
2007 (or in the Andromeda Galaxy, a billion years from now).
Whether we know this, believe it, or take it on faith is open to
debate.

## September 23, 2007

### Hearing

In his book "Auditory Scene Analysis", author Albert S. Bregman likens the ear canals to two narrow channels on the edge of a lake and sound waves to water waves:

"Your friend digs two narrow channels up from the side of the lake. Each is a few feet long, and a few inches wide, and they are spaced a few feet apart. Halfway up each one, your friend stretches a handkerchief and fastens it to the sides of the channel. As waves reach the side of the lake they travel up the channels and cause the two handkerchiefs to go into motion. You are allowed to look only at the handkerchiefs and from their motions to answer a series of questions: How many boats are there on the lake, and where are they? Which is the most powerful one? Which one is closer? Is the wind blowing? Has any large object been dropped suddenly into the lake?"

As impossible as this task sounds, it is analogous to the work performed by your auditory system.

Here is a small experiment. Listen to this first recording and try to guess what it is:

No, it is not some wild animal or an alien. It is just human speech (singing) and some instruments, slowed down. Here is the original, if you are curious:

Now try to listen to the first recording again and see if you can figure out what the words are and where they start and end. People in general do not have a good appreciation of how difficult the problems of perception are (unless they are trying to build a machine to solve these problems). We have been working on speech recognition for decades but the best programs still do not perform very well except in very restricted contexts. Yet we recognize speech so effortlessly that it is difficult to see what the big deal is. Trying to recognize the words in the first recording may help you appreciate the computer's difficulty.

The thing that struck me about the first recording when I first heard it was how different it "felt" from human speech. Although it contained exactly the same information as the original, its "quale", in philosopher-speak, was different. This reminded me of a famous thought experiment devised by philosopher Frank Jackson: Mary the color scientist.

Mary lives in the far future when neuroscience is complete and scientists know everything there is to know about the physical processes in the brain. She has studied and learned everything there is to know about color perception: the optics of the eye, the properties of colored objects, the processing of color information in the visual system, and how this information leads to actions, memories, feelings, etc. But Mary has been brought up all her life in a black and white room and has never seen any colors at all. One day Mary is let out of her black and white room and sees colors for the first time. What happens? Does she learn anything new?

Frank Jackson argues that she obviously learns something fundamentally new: what red is like, its raw feel, its quale.

When I listen to the first recording I think of an alien speech scientist, trying to decipher the message hidden in the signal. The alien can train itself and become an expert at recognizing the words upon hearing the signal. But will it ever get the same "quale" as we get when we listen to the original recording?

Philosopher Dan Dennett argues that there are no such things as qualia and Mary will not learn anything new when she sees the colors for the first time. That is, of course, if we take the premises seriously: that she knows EVERYTHING there is to know about color perception. As counterintuitive as this sounds, I find that when the subject is the "mind", familiar and intuitive is usually wrong.


## July 25, 2007

### Meddling in God's business

Nicholas D. Kristof ends his column on surrogate motherhood and
genetic manipulation as follows:

"What should cross the line into illegality is fiddling with the
heritable DNA of humans to make them smarter, faster or more pious
or more deaf. That is playing God not just with a particular embryo
but with our species, and we should ban it."

The phrase "playing God" has started to appear a lot in these debates. It is a vague, empty phrase that nevertheless evokes negative feelings, and is therefore open to harboring every kind of prejudice. I think it is important to use clean arguments, and I am not so sure Kristof's piece passes that test.

It reminds me of the panel that decided whether or not to give
Craig Venter permission for his latest project, which included
representatives of all the major religions. The representatives
tried to interpret their respective scriptures to decide whether
there was a prohibition on "creating life from non-life". Whatever
your view on religion, I hope you agree with me that we need to
spell out, in terms we can all accept, what criteria we will use
when deciding such questions.

Can we agree on what those criteria should be? More careful
analysis of economic criteria may be one part of the answer. The
project philosophers like Dennett and Drescher are working on,
"getting ought from is", that is, reasoning our way from basic
facts to ethical principles, may be another part.

The hard part of the problem, it seems to me, is deciding what the
goal is. Once the goal is fully specified, putting it into practice
is a social engineering problem. The survival of the human species,
prosperity, health, happiness: what exactly is our aim? If the
transhumanists' dreams start coming true in the next century, will
we have to question some basic goals that have changed little in
two thousand years?

When we agree on a clear goal and engineering solutions that take
us there are found, I hope we will stop stalling one another with
empty terms like "playing God".

## June 28, 2007

### The CoNLL 2007 Shared Task on Dependency Parsing

Joakim Nivre, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel and Deniz Yuret. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Abstract: The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.

## June 26, 2007

### The wastefulness of the universe

When the human genome was sequenced, an interesting fact emerged:
the genes that define proteins and determine our traits occupy only
three percent of the DNA. Doesn't that strike you as a bit
wasteful?

1. There are also DNA segments that encode micro-RNAs, but the
number of micro-RNAs seems to be of the same order as the number of
protein-coding genes. Since micro-RNAs are also shorter, let the 3%
climb to, say, 10%. That still leaves some 90% that looks like
"junk".

2. Of course this region may encode other things, there may be
multiple copies of the RNA-coding segments, and so on. But I know
that there are long repeating sequences in DNA (like ATATATATATA)
that clearly encode nothing. Perhaps those of you who know the
field can tell us what these are doing in the DNA.

3. The odd thing is that some bacteria do much more efficient
coding than we do, using nearly 100% of their DNA (correct me if I
am misremembering). The 3% figure is peculiar to humans and the
other higher animals.

4. It has nothing to do with the topic, but isn't it also
interesting that some simple plants have many more genes than
humans do?

5. And what are we to say about the human body being 60% water?

6. Or about the inside of atoms being 99.9999999% empty? That the
things we call solid matter are mostly empty space is one of the
most interesting findings of 20th century physics. If the nucleus
were a fly, the electrons would circle it in orbits the size of a
football stadium.

7. Finally, if the many-worlds interpretations of quantum mechanics
are correct, the universe splits at every instant into a nearly
infinite number of branches, and we perceive only one of these
branches (our other copies perceive the others, of course); we can
interact with the rest only during careful physics experiments.
What enormous waste, one wants to say. But for a God with an
infinite supply of resources, waste must be a meaningless concept
:)

### Reading thoughts

I am at the Association for Computational Linguistics conference in
Prague. Mitch Marcus gave the opening talk. He is known for his
work in the field of machine learning and for his classic textbook.
He talked about what he has been working on lately, and I found it
interesting enough to share:

In the study, you go into an fMRI machine that scans your brain.
Every ten seconds a word appears for you to read. They correlate
the film of your brain activity captured by the fMRI with the words
you saw, and then try to determine the word you are thinking of
from your brain activity alone.

In the first experiments they started with two groups of words:
tools (screwdriver, hammer, etc.) and places (house, office, etc.).
The models can tell with 100% accuracy which of the two classes the
word you are thinking of belongs to. Performance at identifying
exactly which word you are thinking of is somewhat lower for now:
it varies from person to person, but is around 70% - 95% for
limited word sets. Later experiments included: looking at pictures
of the same objects instead of reading words (performance is even
better in this case); predicting words you later read from the
brain activity learned while looking at pictures (which indicates
that the observed activity is not tied to a single modality such as
text, sound, or image, but really reflects the concepts);
predicting what one person is thinking with models learned from
another person (performance drops a little but is still not bad);
distinguishing words thought in one language with models learned by
reading words in another language; classifying previously unseen
words with a model derived from related words; and so on.
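The decoding setup described above (learn an activity pattern per word class, then classify new scans) can be sketched with a toy nearest-centroid classifier. Everything below is invented for illustration: the four-dimensional "activation" vectors, the noise level, and the tool/place loadings; the real experiments used fMRI voxel data and more sophisticated models.

```python
import random

def centroid(vectors):
    """Mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_centroid(x, centroids):
    """Return the label whose centroid is closest to x (squared distance)."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Hypothetical "activation patterns": tools load on the first two
# dimensions, places on the last two (pure invention for the sketch).
random.seed(0)
def sample(kind):
    base = [1, 1, 0, 0] if kind == "tool" else [0, 0, 1, 1]
    return [b + random.gauss(0, 0.3) for b in base]

train = {"tool": [sample("tool") for _ in range(20)],
         "place": [sample("place") for _ in range(20)]}
centroids = {label: centroid(vs) for label, vs in train.items()}

test_points = [("tool", sample("tool")) for _ in range(10)] + \
              [("place", sample("place")) for _ in range(10)]
accuracy = sum(nearest_centroid(x, centroids) == label
               for label, x in test_points) / len(test_points)
print(accuracy)
```

With well-separated synthetic classes the toy classifier gets essentially all of the category decisions right, mirroring the tools-versus-places result; distinguishing individual words is harder precisely because their patterns overlap more.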

For yours truly, who has been dreaming of mind-reading machines
since the age of three, it was of course a very exciting talk. As
soon as I return to Turkey I want to go to the American Hospital
and look into ways I might get to play with their fMRI machine.

The interesting thing is that not everyone was as excited as I was.
I noticed that some friends were quite uneasy thinking about where
this technology is headed. When fMRI shrinks enough to be portable
and its sensitivity is raised enough to distinguish the sentences
you are thinking, they will have to walk around with aluminum foil
wrapped around their heads.

As for me, I have already surrendered on the privacy front. Perhaps
it is better that way. People should accept their own nature and
set their standards accordingly. Tim Berners-Lee probably could not
have predicted that 50% of Internet traffic would be porn. And yet
it is not as if the sky has fallen.

On this front David Brin has some interesting ideas that I know of:
the real problem is if transparency is one-way, between the
powerful and the powerless, the state and its citizens, and so on.
Since we cannot prevent the one-way version, we should at least
start a cultural revolution to make it two-way. Like the proposal
in England to let citizens watch, through cameras of their own, the
officers in the police stations who monitor the street cameras.
Anyway, this topic deserves a message of its own.

The benefits of this technology that excite me, on the other hand,
are too many to count. First, for 2500 years everyone has been
speculating about how concepts and meanings are stored in the
brain, and we still do not have the slightest idea. Great progress
could be made on this in the coming years. Even if we cannot solve
AI ourselves in the next 50 years, as Kurzweil has always claimed,
reverse engineering may become the easier solution once brain
imaging devices are sensitive enough. In the nearer term, we may no
longer need gadgets like keyboards, mice, and cell phones to
communicate with computers, the Internet, and even other people:
with a small wireless Internet chip injected into the brain,
telepathy may finally become real. (Of course, filters akin to call
waiting and answering machines become even more important in that
scenario.) And then there is the potential in politics and law:
instead of lie detectors, imagine a laptop that reads the
defendant's thoughts. When the American president wins an election,
instead of putting his hand on the Bible and reciting the standard
lines, voters may want to stick his head into an fMRI and find out
what he really thinks. Shall I go on?

## June 25, 2007

### Targeted Textual Entailments - A Proposal

1. Definition: A targeted textual entailment (TTE) task uses entailment questions to test a specific competence of a system, such as word sense disambiguation, semantic relation recognition, or parsing. Even if we do not know the best theory underlying a competence, we know what having that competence enables people to do. For example:

1.1 WSD
"They had a board meeting today."
==> "They had a committee meeting today." [yes]
==> "They had a plank meeting today." [no]

1.2 SemRel
"John opened the car door."
==> "The door is part of the car." [yes]
==> "The car produced the door." [no]

1.3 Parsing
"I saw the bird using a telescope."
==> "I used a telescope" [yes]
==> "The bird used a telescope" [no]

2.1 Currently most shared tasks use or favor a specific inventory, representation, or linguistic theory. In WSD, WordNet is often used as the sense inventory although everybody complains about it. We have FrameNet, PropBank, NomBank, various logical formalisms, and different sets of noun-noun relations people work on in the semantic relations area. The parsing community is split into a constituency group and a dependency group that rarely compare results. Formulating TTE tasks in these fields will help test systems on a level playing field no matter which inventory, representation, or linguistic theory they use.

2.2 Large annotation efforts struggle to achieve high inter-annotator agreement (ITA). My hypothesis is that most annotators understand the sentences they are supposed to annotate equally well, but do not understand the formalism well enough for consistent labeling. By asking simple entailment questions where all they need to do is choose yes/no/uncertain, it is hoped that (i) annotators will need no education in a particular formalism, (ii) annotation will proceed faster, and (iii) final ITA will be higher. (We throw away the examples that get a lot of "uncertain" answers.)

3. Methodology: For a TTE task to be useful and challenging the examples should be chosen close to the border that separates the positives from the negatives. In other words, the positive examples should have non-trivial alternatives and the negative examples should be "near-misses". (My examples in Section 1 are not all very good by these criteria.) For example in the WSD TTE task (which is basically lexical substitution), the substitute should be chosen such that (i) it is a near-synonym for one of the target's senses, and/or (ii) it has a high probability of occurring in the given context. In the parsing task, examples should be based on decision points where a typical parser can go either way or where the n-best parses disagree. This suggests that examples can be generated automatically by taking the best automated systems of the day and focusing on decisions about which they are least confident. This type of "active learning" methodology will uncover weaknesses that the next generation systems can focus on.
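To make the proposed evaluation protocol concrete, here is one way the data and scoring could be represented. The `TTEItem` structure, its field names, and the trivial baseline are all hypothetical; nothing here is a specification of an actual shared-task format.

```python
from dataclasses import dataclass

@dataclass
class TTEItem:
    """One targeted textual entailment question (hypothetical format)."""
    text: str        # source sentence
    hypothesis: str  # entailment candidate
    label: str       # gold answer: "yes" or "no"

# The WSD and parsing examples from Section 1, as data.
items = [
    TTEItem("They had a board meeting today.",
            "They had a committee meeting today.", "yes"),
    TTEItem("They had a board meeting today.",
            "They had a plank meeting today.", "no"),
    TTEItem("I saw the bird using a telescope.",
            "I used a telescope", "yes"),
    TTEItem("I saw the bird using a telescope.",
            "The bird used a telescope", "no"),
]

def evaluate(system, items):
    """Accuracy of a system, i.e. a function (text, hypothesis) -> label."""
    correct = sum(system(it.text, it.hypothesis) == it.label for it in items)
    return correct / len(items)

# A trivial baseline that always answers "yes" gets exactly the
# fraction of positive items right.
print(evaluate(lambda t, h: "yes", items))  # -> 0.5
```

Because every answer is a bare yes/no, systems built on any inventory or formalism can be scored by the same `evaluate` loop, which is the point of the proposal.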

## June 23, 2007

### KU: Word Sense Disambiguation by Substitution

Deniz Yuret. In Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

Abstract: Data sparsity is one of the main factors that make word sense disambiguation (WSD) difficult. To overcome this problem we need to find effective ways to use resources other than sense labeled data. In this paper I describe a WSD system that uses a statistical language model based on a large unannotated corpus. The model is used to evaluate the likelihood of various substitutes for a word in a given context. These likelihoods are then used to determine the best sense for the word in novel contexts. The resulting system participated in three tasks in the SemEval 2007 workshop. The WSD of prepositions task proved to be challenging for the system, possibly illustrating some of its limitations: e.g. not all words have good substitutes. The system achieved promising results for the English lexical sample and English lexical substitution tasks.
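The substitution idea in the abstract can be sketched as follows. The sense labels, the substitute lists, and the bigram log probabilities below are toy values made up for illustration; the actual system scored substitutes with a large n-gram language model trained on an unannotated corpus.

```python
# Toy bigram scores standing in for a real language model
# (all numbers invented for illustration).
bigram_logprob = {
    ("school", "committee"): -2.0,
    ("school", "plank"): -8.0,
    ("wooden", "committee"): -9.0,
    ("wooden", "plank"): -1.5,
}

# Each sense of the ambiguous word comes with near-synonym substitutes.
senses = {
    "board#group": ["committee"],
    "board#lumber": ["plank"],
}

def disambiguate(prev_word):
    """Pick the sense whose best substitute fits the context best."""
    def score(sense):
        return max(bigram_logprob.get((prev_word, sub), float("-inf"))
                   for sub in senses[sense])
    return max(senses, key=score)

print(disambiguate("school"))  # -> board#group
print(disambiguate("wooden"))  # -> board#lumber
```

The caveat mentioned in the abstract shows up directly in this sketch: if a word (a preposition, say) has no good substitutes, no entry in `senses` separates its readings, and the method has nothing to score.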

### SemEval-2007 Task 04: Classification of Semantic Relations between Nominals

Roxana Girju, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney and Deniz Yuret. In Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

Abstract: The NLP community has shown a renewed interest in deeper semantic analyses, among them automatic recognition of relations between pairs of words in a text. We present an evaluation task designed to provide a framework for comparing different approaches to classifying semantic relations between nominals in a sentence. This is part of SemEval, the 4th edition of the semantic evaluation event previously known as SensEval. We define the task, describe the training/test data and their creation, list the participating systems and discuss their results. There were 14 teams who submitted 15 systems.

## June 03, 2007

For a long time now, the problem that has nagged me most among
consciousness-related topics is the problem of "will" (or "desire",
"motivation", "drive"). Why, when we wake up in the morning, do we
prefer doing something to doing nothing at all? What is the basic
source that generates our behavior? When I ask myself why I did
something, after one or two rationalizations my final answer turns
out to be "because I felt like it". Doesn't this answer (or the
concept of "feeling like" something in general) raise a deep
suspicion of brain censorship in you too? If I were an evil alien
who wanted to control humans, instead of trying to force them to do
things, I would tamper with the "feeling like it" systems in their
brains: that way they would carry out my goals because they "felt
like it" themselves, and since they have no idea what they feel
like doing or why, they would never suspect my existence.

The importance of this problem hits me whenever I lose my
motivation to work now and then. There is a big difference between
doing something you "want" to do and forcing yourself to do
something you "do not want" to do, even if the resulting "behavior"
is indistinguishable from Skinner's point of view.

Last year, when I found the following remark by the famous
biologist J.B.S. Haldane, I realized I was not alone in this
curiosity:

"I have come to the conclusion that my subjective account of my own
motivation is largely mythical on almost all occasions. I don't
know why I do things." -- J.B.S. Haldane

The sad part is that the AI culture I come from does not emphasize
the importance of such a problem. In a classical AI system,
perception determines the state of the world, plans are made in
pursuit of goals, and these plans are turned into behavior. So
where do the goals come from? From whoever wrote the program, of
course. Motivation has thus remained the least thought-about part.
Recently, while reading what Ben Goertzel, one of the modern AI
people, wrote on kurzweilai.net about his own cognitive model, I
saw the following paragraph (quoted as is):

"And if a system can recognize itself, it can recognize
probabilistic relationships between itself and various effects in
the world. It can recognize patterns of the form "If I do X, then
Y is likely to occur." This leads to the pattern known as will."
-- Ben Goertzel

In short, knowledge about behaviors and their likely consequences
has been conflated with the question of which of these behaviors
gets activated, when, and for what reason.

Maybe this is a general blindness among us computer people. If you
think about it, the most sophisticated programs in the world spend
most of their time waiting for you to press a key! To this day we
have not managed to write a program that can do something
interesting on its own for days on end. For example, the AM program
Doug Lenat wrote for his PhD thesis, which tried to automatically
discover interesting mathematical conjectures, went from addition
and subtraction to the mathematics of the 1700s within a few hours.
There is no program that goes on thinking about mathematics and
finding interesting things for years. Our long-running programs do
monotonous, endlessly self-repeating jobs: physically simulating
proteins, indexing the Internet, looking for patterns in messages
from space. Perhaps it is perfectly natural that machines which
carry out our wishes completely and flawlessly cannot do anything
beyond what we are able to imagine.

The only group producing explanatory models of motivation, however
simple, is the science of ethology (animal behavior). Pioneers like
Tinbergen and Lorenz at least began to develop mechanical models of
why simple animals do what they do. Gallistel's book "The
Organization of Action" has been sitting on my shelf for a long
time; if I read it soon and find something interesting, I will
write about it.

In humans, approaching and thinking about this problem is
especially hard, because the feeling of "I felt like it" is so
satisfying that there seems to be an auto-censorship mechanism in
the brain against thinking that something might lie behind it.

In his last message Onur touched on whether there is any
fundamental difference between certain scientific concepts and
religious ones. There is a very simple difference. Using one, we
can make consistent predictions about what will happen in the
future (in an experiment, for example); using the other, we cannot
(see our "what is science" debate). This is what I mean by
"explaining". For example, if we try to explain my motivation with
a "soul" model, we have merely postponed the question: how will we
explain the soul's motivation? Moreover, we have hope of one day
explaining the motivation of the physical "me": the quantum rules
that explain my atoms may be hard to understand, but in the end
they are binding, and to nine decimal places at that. As far as I
know, there are no binding rules that determine the behavior of the
"soul". Materialism is not something people believe blindly; it is
simply the only alternative that tries to answer such questions
without postponing the same question one step further. Let us be
clear about what counts as an "explanation" and what does not.

## May 22, 2007

### A New Kind of Science - Stephen Wolfram

I recently heard that Stephen Wolfram (the creator of Mathematica)
has put a question with a \$25,000 prize on the web
(http://www.wolframprize.org). This persuaded me to take a look at
"A New Kind of Science", a book I had bought a while ago but never
dared open because it is 1000 pages long.

The main topic is the existence of universal machines that can do
any job (in the computational sense) and simulate any machine you
can think of. For minds used to dealing with physical objects, this
is hard to believe. Imagine a machine that can simulate a car when
you want, a screwdriver when you want, a telephone when you want!
For one thing, we have the intuition that for machine A to simulate
machine B, A must be the more complex machine. If this intuition
were correct, a universal machine would be impossible; we could
always design machines more complex than it. (Note the analogy with
the proof that there are infinitely many primes.)

But in 1936 Alan Turing showed that such a machine is possible in
the virtual realm. The interesting part is that Turing's universal
machine is rather simple. Imagine a writing head that can move back
and forth along an infinite tape like the head of a typewriter (PCs
had not been invented yet in 1936, but there were typewriters). The
only things this head can do are read the symbol it is currently
sitting on and, according to a finite number of rules, write a new
symbol and move one position to the right or left. Turing proved
that this machine can simulate all other machines.
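The read-write-move cycle just described is easy to put into code. Below is a minimal Turing machine simulator; the rule table format and the toy two-rule machine are my own invention for the sketch, not Turing's original machine.

```python
def run_tm(rules, tape, state="A", pos=0, max_steps=100):
    """Simulate a Turing machine.

    rules maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (left) or +1 (right). A missing entry halts
    the machine. The tape is sparse with blank symbol "0".
    """
    cells = dict(enumerate(tape))
    for _ in range(max_steps):
        key = (state, cells.get(pos, "0"))
        if key not in rules:
            break
        symbol, move, state = rules[key]
        cells[pos] = symbol
        pos += move
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, "0") for i in range(lo, hi + 1))

# A toy 2-state machine that writes "11" and halts.
rules = {
    ("A", "0"): ("1", +1, "B"),
    ("B", "0"): ("1", +1, "H"),  # "H" has no rules, so the machine halts
}
print(run_tm(rules, "0"))  # -> 11
```

The whole machine is just a lookup table driving a head along a tape; universality comes not from this loop but from a sufficiently clever choice of `rules`.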

Of course this glorified typewriter is not the only universal
machine; Turing machines were merely the first whose universality
was proved (though one could argue that a universal machine is also
hidden in Godel's famous 1931 proof). If you read the third chapter
of "A New Kind of Science", it has a good list of simple machines
with nice descriptions (http://www.wolframscience.com/nksonline):

- cellular automata
- mobile automata
- turing machines
- substitution systems
- tag systems
- cyclic tag systems
- register machines
- symbolic systems

And of course all the computer languages and CPUs we use today can
be added to this list of universal machines (after adding an
infinite memory).
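The first entry on that list, one-dimensional cellular automata, fits in a few lines of code. The sketch below steps an elementary cellular automaton (rule 110 by default, the rule whose universality Matthew Cook proved) on a finite row with zero boundaries; a genuinely universal run would of course need an unbounded row and a carefully prepared initial condition.

```python
def step(cells, rule=110):
    """One step of an elementary cellular automaton (zero boundaries).

    Each cell's next value is the bit of `rule` indexed by the
    three-cell neighborhood read as a binary number.
    """
    padded = [0] + cells + [0]
    out = []
    for i in range(1, len(padded) - 1):
        neighborhood = padded[i - 1] * 4 + padded[i] * 2 + padded[i + 1]
        out.append((rule >> neighborhood) & 1)
    return out

# Start from a single live cell and print a few steps.
row = [0] * 5 + [1] + [0] * 5
for _ in range(4):
    print("".join(".#"[c] for c in row))
    row = step(row)
```

Despite the triviality of the update rule, this is exactly the kind of system the book argues can reach universality once the rule and initial condition are chosen carefully.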

Among all these kinds of machines, universality appears once a
certain level of complexity is reached: any one of these universal
machines can simulate all the others. But not every Turing machine
is universal. For example, it has been shown that Turing machines
that can read and write only two kinds of symbols and act according
to only two kinds of rules cannot be universal. I do not know
exactly how many symbols and rules Turing's original machine had.
But in the 1960s Marvin Minsky showed that a universal Turing
machine can be built with 4 symbols and 7 rules. For a long time
nobody worked on simple Turing machines, until Wolfram showed in
the 1990s that a universal machine can be built using 5 symbols and
2 rules. So there are 10 different conditions this machine can find
itself in. A table specifying, for each of these 10 conditions,
what to write at the current position and whether to move right or
left is enough to simulate every program written to date! (You just
need to write a suitable translation of your program onto the
machine's tape at the start, and keep in mind that the simulation
is a bit slow.)

The prize question is to prove the universality of a Turing machine
Wolfram has picked with 2 rules and 3 colors. Any volunteers?

### A reading list on consciousness

After a long while I have started thinking and reading about "consciousness" again. Below I am sending a small reading list for anyone interested. But first I thought I would write my own answer to the question "what is the fundamental problem here?". It may not be satisfying, but let us enter the topic from somewhere.

Before Galileo and Newton discovered "universal laws" and Laplace
invented his demon (Laplace's demon), "consciousness" did not seem
to pose as serious a problem as it does today. We could explain
life in terms of our free will, our desires, and our beliefs, and
we could commit to the reality of the soul more strongly than to
that of matter. And why not? As Descartes said, the things we are
most certain of are, in the end, our own thoughts, perceptions, and
desires; everything else could be a lie after all... When I try to
bring up the topic of consciousness with friends who have little
interest in science, I usually meet a similar reaction: "what is
the problem?", they seem to say; "I know that these things (my
will, my perceptions, my beliefs) are real; if your science cannot
explain them yet, that is science's problem..."

Of course, scientifically, nobody has yet been able to explain
consciousness and similar psychological concepts as functions of
the brain. Some philosophers are opposed in principle to the
possibility of such an explanation. Still, there is a widespread
belief that in the end everything can be explained in terms of the
fundamental laws of physics. And the fundamental laws of physics
admit no outside influence (conservation of energy, etc.). In that
case our controlling anything (which is one of our strongest
beliefs) appears impossible. In the end we all seem to be slaves of
our atoms. But I certainly do not feel like a cloud that its atoms
push this way and that.
There are two ways out of this situation. The first is to make no
compromise on the fundamental status of will, perception, desire,
and belief. We may possibly need to modify physics (the Penrose
method). Or to look for room for the will within the gaps of
physics (quantum indeterminacy, chaos, etc.). Or to posit that
there is, besides matter and energy, a new fundamental entity
called "consciousness", and start over.

The second way is to make no compromise on physics. What people in
this position have to do is answer the question "how come it looks
this way to us?". All the books in the list at the end of this
message were written by people trying to follow this second way.
But then, if we really are clouds that atoms push this way and that
(or stardust, in Sagan's phrase):

1. How is it that I can move my arm whenever I want?

2. Red looks a certain way to me; even if you knew every detail of
my brain, could you understand how red looks to me?

3. How is it possible to explain and predict the behavior of humans
(and some other creatures) rather well with concepts like "desire",
"belief", and "will"? What is the physical counterpart of a concept
like "desire"?

4. If we built a robot that reacted exactly the way I do, would
there also be "somebody" inside it, looking out and really seeing
things? Or could it be empty inside? If we asked this robot whether
it is conscious and it said yes, just as I do, would it be lying?

5. If we are all stardust, why are we so sensitive about one group
of dust destroying another group of dust, or making it suffer (what
does suffering even mean?)? And when such bad things happen, if the
atoms pushing things around are responsible, isn't a penal system
absurd?

6. Why do I feel like a being that wants, acts, and believes
certain things?

This list could be extended. But notice that almost every question
arises from the clash between Laplace's mechanical universe and our
psychological and moral concepts. For the last (and, I think, most
important) question, a well-known anecdote about Wittgenstein is
quite illuminating:

Wittgenstein asked a friend: "I have always wondered why people
thought for so long that the sun revolves around the earth." The
friend answered: "Why? Because it looks that way from here, of
course..." Wittgenstein's reply: "And how would it have looked if
the earth were rotating on its axis?"

Well then, if we really were complex creatures formed by the
jostling of trillions of molecules, how would we feel?

======
Reading list:

[1] Brainstorms, Daniel C. Dennett 1978
- ... but he seems to shed light in the right directions on these
fuzzy topics.

[2] Godel, Escher, Bach, Douglas Hofstadter 1979
- The classic that AI folks praise so highly. One of the most
beautifully designed (aesthetically speaking) books I have ever
read (the dialogues, the anagrams, the pictures). Even though it
must be terribly hard to translate, I hear somebody has gone to
the trouble. I pick up a copy or two at every opportunity and give
them to my students as gifts :)

[3] Mind's I, Hofstadter and Dennett 1981
- Consists of beautiful, easy-to-read stories (philosophical
fiction?) - call it a Mesnevi of consciousness - I think it should
be on the group's required reading list, so that instead of
struggling over and over to explain complex ideas we can simply
refer to these stories.

[4] Social Brain, Gazzaniga 1987
- One of the neuroscience accounts that best shows that
consciousness is not (?) what it seems. The biggest problem is
that everyone feels they already understand consciousness; the
first step toward real understanding is to have this "assumed"
understanding slowly dismantled by beautiful experiments like
these, and this book is a good start.

[5] Consciousness Explained, Daniel C. Dennett 1991
- ... puts it all together. But of course, since it consists of
answers like "what is consciousness?" -> "consciousness is an
illusion" and "what are qualia?" -> "qualia do not exist", its
rivals have dubbed the book "Consciousness Explained Away".

[6] Consciousness: A Very Short Introduction, Susan Blackmore 2005
- A short and lovely introduction that summarizes the others very
well. If you are only going to look at one book, I would say read
this one.

[7] Good and Real, Gary L. Drescher 2006
- ... an ambitious work arguing that we can explain not only ...,
but the moral as well.

[8] I am a Strange Loop, Douglas Hofstadter 2007
- I am only just reading this one, but in Hofstadter's own words
it is a book written for those who did not quite get what he was
trying to say in GEB [2].

## April 11, 2007

### Locally Scaled Density Based Clustering

Ergun Biçici and Deniz Yuret. In International Conference on Adaptive and Natural Computing Algorithms (ICANNGA 2007), LNCS 4431, Part I, Springer-Verlag.

Abstract: Density based clustering methods allow the identification of arbitrary, not necessarily convex regions of data points that are densely populated. The number of clusters does not need to be specified beforehand; a cluster is defined to be a connected region that exceeds a given density threshold. This paper introduces the notion of local scaling in density based clustering, which determines the density threshold based on the local statistics of the data. The local maxima of density are discovered using a k-nearest-neighbor density estimation and used as centers of potential clusters. Each cluster is grown until the density falls below a pre-specified ratio of the center point’s density. The resulting clustering technique is able to identify clusters of arbitrary shape on noisy backgrounds that contain significant density gradients. The focus of this paper is to automate the process of clustering by making use of the local density information for arbitrarily sized, shaped, located, and numbered clusters. The performance of the new algorithm is promising as it is demonstrated on a number of synthetic datasets and images for a wide range of its parameters.
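The procedure the abstract describes (k-nearest-neighbor density estimation, local density maxima as cluster centers, growth until density falls below a fixed ratio of the center's density) can be sketched roughly as follows. This is a minimal reading of the abstract, not the paper's actual code; the parameter names `k` and `ratio` and the inverse-kNN-radius density estimate are illustrative assumptions.

```python
import numpy as np

def locally_scaled_clusters(X, k=5, ratio=0.5):
    """Sketch of density-based clustering with local scaling, as read
    from the abstract (not the paper's implementation). Density at each
    point is estimated from its k-nearest-neighbor radius; the densest
    unlabeled point seeds a cluster, which then grows through the k-NN
    graph while density stays above `ratio` times the seed's density."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nn_order = np.argsort(D, axis=1)
    knn = nn_order[:, 1:k + 1]              # k nearest neighbors, self excluded
    kdist = D[np.arange(n), knn[:, -1]]     # radius to the k-th neighbor
    density = 1.0 / (kdist + 1e-12)         # inverse k-NN radius as density
    labels = np.full(n, -1)
    next_label = 0
    for c in np.argsort(-density):          # visit densest points first
        if labels[c] != -1:
            continue
        threshold = ratio * density[c]      # local, center-relative threshold
        labels[c] = next_label
        stack = [c]
        while stack:
            p = stack.pop()
            for q in knn[p]:
                if labels[q] == -1 and density[q] >= threshold:
                    labels[q] = next_label
                    stack.append(q)
        next_label += 1
    return labels
```

Because the threshold is relative to each seed's own density rather than global, dense and sparse regions are judged on their local scale, which is the sense in which the clustering is "locally scaled"; the number of clusters falls out of the procedure instead of being specified in advance.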

## March 19, 2007

### Drexler quote

It makes sense to think in terms of three levels of knowledge about a field:

1. Knowing what a field is about—knowing what sorts of physical systems and phenomena it deals with, and what sorts of questions it asks and answers.
2. Knowing the content of a field in a qualitative sense—having a good feel for what sorts of phenomena can be important in what circumstances, and knowing when you need answers from work in that field.
3. Knowing how to get those answers yourself, based on personal mastery of enough of the field's subject matter.

If one has enough knowledge at levels (1) and (2) in enough fields, then one can steer clear of problems in those fields while doing work in a related field where you have knowledge at level (3). And this is a good thing, because knowledge at levels (1) and (2) takes far less time to acquire. But to make proper use of knowledge at levels (1) and (2) requires a harsh discipline: attempt to assume the worst about what you don't know. Don't assume that a poorly-understood physical effect will somehow save your design; do assume (until finding otherwise) that it may utterly ruin it. Without this discipline, you'll become an intellectual hazard. With it, you'll be able to make a real contribution.

## March 11, 2007

### Technology and society

Two interesting articles on technology and its impact on society. I received the first one from Ergun Bicici. The second one came from Emrah Çevik.

U.C. Berkeley Graduation Speech by Peter Norvig
Economic Possibilities for our Grandchildren by JM Keynes (1930)

## February 22, 2007

### On Consciousness

If prizes for "not being what it seems" were handed out in our
conceptual space, "consciousness" would surely be deemed worthy of
first prize.

The question that naturally comes to mind - "so who is perceiving
all this information?" - is perhaps the one that gives us the most
trouble in understanding consciousness. Positing a little
consciousness inside (a homunculus) that watches the percepts does
not help, because then the same question must be asked about the
little consciousness. Besides, what we know about the brain does
not really support the idea of a center where the percepts
converge. Even if it did, that would not be satisfying: if someone
opened up your brain and said "there, that cluster of neurons is
your consciousness", it would not mean much to you. For example, I
know that the music I am listening to right now is encoded as a
series of neuron firings in my auditory cortex, but that is not
enough to explain the subjective quality of the music I hear.

One alternative is to look for consciousness outside the
brain. All the major religions (with the exception of Buddhism, I
believe) accept the reality of a spiritual world. This dualist
position has also been defended by many serious philosophers since
Descartes. Popper, whose views on other subjects I respect
greatly, is one of the most recent examples of this school
[3]. The biggest problem here is that every serious philosopher
nowadays believes in physical laws such as the conservation of
energy, and this makes it impossible for a phenomenon outside
physics to affect physical objects (such as our arms and legs).

Of course, whatever this "consciousness" material may be, it does
not have to lie outside physics. To preserve the dualist position,
some have tried to extend physics and make room for consciousness
within it - Roger Penrose, for example, who has come up in this
forum before [4].

The position most consistent with what we know today would be to
accept that consciousness simply consists of the processes in the
brain, rather than to look for a new material for
it. Unfortunately, the biggest psychological obstacle in front of
this materialist approach is the question we started with: "so who
is perceiving all this information?"...

A great deal has been written on this subject. I particularly
recommend [1], [2], and [5]. It is not easy to honestly convince
oneself that consciousness is an illusion (by the way, "illusion"
here does not mean "nonexistent" or "false", only "not what it
appears to be").

Shattering illusions is somewhat easier in other domains. For
example, right now my sense of vision tells me that I am
comfortably perceiving my desk, my computer, my phone, the shelves
- in short, nearly everything in this corner of my room. Yet a
simple experiment makes it easy to see that I am almost blind
outside an area the size of my thumbnail. On a day you are eating
rice or lentils, try counting the last few grains left on your
plate while keeping one eye closed and the other fixed on a single
point - you will see how impossible it is. Or, if your partner is
willing, try this: you will not be able to tell whether the back
of your leg is being touched with a single pointed object or with
two different objects 2-3 cm apart at the same time.

If Memduh is right and consciousness is likewise a sense organ
that lets us perceive time, perhaps we can design
illusion-shattering tests like the ones above. What do you say?

Susan Blackmore recommends meditation. For me the turning point
was coming to believe that I could build a robot that would one
day tell me, in all honesty, "yes, I am conscious." A robot that
can perceive and remember what goes on around it; that can model
other creatures together with their intentions and desires; that
can remember what they have done and predict what they will do -
what do you suppose such a robot would see when it turns that
analytical power on itself? Do you think we could easily convince
it that it is a mechanical robot?

[1] Consciousness, A Very Short Introduction -- Susan Blackmore
[2] Consciousness Explained -- Daniel C. Dennett
[3] The Self and Its Brain -- Karl Popper and John Eccles
[4] The Emperor's New Mind -- Roger Penrose
[5] Mind's I -- Douglas Hofstadter and Daniel Dennett

## February 21, 2007

### Intelligent Design

I started today by reading the message Berkin sent about Turkey's
first "intelligent design" conference:

http://www.mustafaakyol.org/archives/2007/02/post_1.php

I congratulate the Istanbul Metropolitan Municipality. The source
of my reaction is not so much that I think the idea of intelligent
design is wrong, but that we import even this kind of anti-science
propaganda from America's "bible belt". I understand not being
able to be original when designing a car, I understand bringing
McDonalds and El Torito to Istanbul, but when it comes to
producing pious ideas, please let nobody claim we need America's
help. We have hundreds of years more experience in this business
than they do; we can be more creative...

At our last in-person meeting, Memduh had asked what kind of
argument one could offer to someone who accepts all the evidence
and is open-minded, yet still has difficulty believing in
evolution. Unfortunately, the books I have read on this subject
mostly busy themselves with the other side's straw-man
arguments. The question has been nagging at my mind ever since.

Meanwhile, of course, my class hour arrived, and I walked in to
teach probability to 60 engineering students (the day's topic:
permutations and combinations). As you can imagine, more than half
of the lecture went to examples about evolution :)

Take the example of monkeys producing Shakespeare by hitting
typewriter keys at random. The "random" emergence of the complex
structures found in living things (eyes, brains, livers) is
usually compared to the literary talents of our monkeys. The
students immediately spotted the first weak point of this
argument: a great many organisms have been playing the evolution
game for a very long time; this was never the work of one or two
monkeys. But that weak point is not decisive. The probability of
randomly typing a typical 100,000-letter Shakespeare play is a
number like 30^-100,000, which would not suffice even if every
particle in the universe had been at the task since the big
bang. The real weak point is the assumption that the monkeys keep
hitting keys at random. In evolution, the monkey that hits a wrong
key dies, the letter it typed is erased, and another monkey takes
its place - until one hits the right key. Sacrificing an average
of 30 monkeys per letter this way, 3 million monkeys can
comfortably write out a Shakespeare play. That many monkeys can
easily be found even in Istanbul.
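The letter-by-letter selection scheme above is easy to simulate. The sketch below is my own illustration in the spirit of Dawkins' "weasel" program, not something from the original post; the particular 30-symbol alphabet is an assumption chosen to match the 30^-100,000 figure.

```python
import random
import string

# A 30-symbol alphabet (26 letters plus space and some punctuation),
# matching the 30^-100,000 figure in the post.
ALPHABET = string.ascii_lowercase + " .,'"
assert len(ALPHABET) == 30

def monkeys_with_selection(target, rng):
    """Type `target` one letter at a time: every wrong keypress costs a
    monkey and its letter is erased; a correct keypress is kept and we
    move on. Returns the total number of keypresses (monkeys used)."""
    presses = 0
    for ch in target:
        while True:
            presses += 1
            if rng.choice(ALPHABET) == ch:
                break          # right key: this letter survives
    return presses

# Each letter takes about 30 tries on average (geometric with p = 1/30),
# so a 100,000-letter play needs roughly 3 million monkeys in total
# rather than the hopeless 30^100,000 of pure chance.
rng = random.Random(42)
line = "to be or not to be"
print(monkeys_with_selection(line, rng) / len(line))
```

The point the simulation makes is the one in the paragraph above: once wrong guesses are discarded and right guesses are kept, the cost grows linearly with the length of the text instead of exponentially.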

But all of this is beside the point. The real issue lies
elsewhere. A person who does not want to believe in evolution
cannot, in my opinion, be convinced. Let me give a simpler
example. If I want to believe that the universe began one day ago,
nobody can convince me otherwise. My memories? I was created with
those memories in my brain. The fossils in the ground? Those rocks
were created that way too. Carbon dating? The being who did all
this yesterday set the ratio of C-12 to C-14 in those objects
accordingly. Nothing you can say can prove to me that the universe
existed before yesterday, or that it will continue to exist after
tomorrow - because there is no logical hole in my assumption.

Why is it, then, that one of the possible assumptions (the
universe began 14 billion years ago with the big bang, etc., etc.)
appeals to me more than the others? We recently discussed on this
forum what science is and is not (falsifiability and so on). But
who cares about falsifiability? Why should I bend over backwards
to make everything I say falsifiable? Why do I prefer this system
of thought to the others?

In the end I decided there are two reasons.

The first is curiosity and aesthetics. Since childhood I have
wondered how things work. What separates the living from the
non-living, where did the dinosaurs come from and where did they
go, how do genes work, what goes on inside cells, could we build a
living cell if we put all of its components together - these are
all interconnected questions spinning around in my head. Someone
coming along and telling me "it is this way because an intelligent
being designed it this way" is no different from telling me "don't
ask that question" - so it does not satisfy my curiosity - and
besides, as an idea it ...

The second reason is utility. I believe that using the scientific
method we can solve problems, defeat diseases, control nature
better, and so on. In fact, this idea would appeal to me even if
the universe had not existed before yesterday. Simply believing
that the universe will also exist after tomorrow is enough for me
to stay curious and to work on problems. Though I would probably
still do mathematics even if the universe were to end tomorrow.

The interesting part is that if this last point is right, one
would expect those who do not believe in science, and hence in
evolution, to fall behind in technology and medicine and to be
eliminated according to the laws of evolution. Of course, the
social order we have built keeps people's scientific beliefs
rather uncorrelated with the number of children they have. It
would have been ironic otherwise - as if evolution were punishing
those who do not believe in evolution...

Of course, this is not a convincing argument against the
intelligent design crowd either, since they in turn believe that
people like me will soon be wiped out by a holy catastrophe :)

### Reckoning with Risk by Gerd Gigerenzer

This is the last one of the new breed of "innumeracy" books I have read. Not one of the better ones. Still ok for generating examples for my probability course. Other examples of the genre include:

- Innumeracy by Paulos
- A mathematician reads the newspaper by Paulos
- How to lie with statistics by Huff

Not in the same genre but the following books on the history of probability and statistics may also be useful for stories and examples:

- Games, Gods and Gambling by David
- The Probabilistic Revolution by Krüger et al.
- The History of Statistics by Stigler
- The Lady Tasting Tea by Salsburg

The website
http://www.planetqhe.com/beta/information/home.htm contains nice problems and animations as well as some book recommendations and links.

- The Magical Maze by Ian Stewart
- Chance Rules by Brian Everitt
- Inevitable Illusions by Massimo Piattelli-Palmarini
- Taking Chances by John Haigh
- Fifty Challenging Problems in Probability by Frederick Mosteller
- Cartoon Guide to Statistics by Gonick and Smith


## February 14, 2007

### Smoking and freedom

Thoughts on Amartya Sen's article "Unrestrained smoking is a libertarian half-way house"...

It is interesting, of course, that while setting out to find holes
in the libertarian discourse, Amartya Sen leaves gaps in his own
arguments big enough to drive a train through. One of them is
letting the state poke its nose into the conflict between past and
future selves without ever mentioning the Plato limit Emre brought
up; another is the unquestioned assumption that everyone would
prefer social assistance to the restriction of certain freedoms
(indeed, the alternative is painted as a "monstrously unforgiving
society", stigmatizing from the outset anyone who might want to
think about solutions in that direction).

If the state is to be the product of a social contract, then I
should also get to decide what it will and will not protect me
from. And to the extent that I want protection, I must put up with
restrictions on my freedoms. For example, if I do not want anyone
to kill me and I have charged the state with protecting me, then I
should not object when my own freedom to kill others is taken
away. (Of course, different people will want protection from
different things, matters will get complicated, and so on.)

But do I really want some third party to protect my future selves
from my present self? That sounds like a nightmare to me. My
present self accepts and loves my past self with all his mistakes
(perhaps especially the mistakes). If someone had "protected" me
by telling me not to do this and not to do that, the Deniz
standing before you would not exist (ask my parents - they tried
plenty, and I never listened). I have no objection to treating
different selves as different individuals, but please let nobody
else meddle in the disagreements among us. Going a bit further: I
would rather the state not meddle in disagreements between me and
my close family and friends either; at the very least, it seems
absurd to me that the rules applied there should be the same as
the rules governing my dealings with complete strangers.

Whenever I try to think about solutions to social questions
derived from simple principles (axioms), certain acquaintances of
mine who have thought about these matters far more than I have
explain that things are not so simple, that people are
complicated, point out the holes in the solutions my simple
technical mind proposes, and chuckle at my naivete. I still
believe that trying to approach, asymptotically, ideals derived
from simple principles is better than producing an unprincipled
muddle of solutions out of fuzzy gut feelings. Even if I cannot
prove that we would reach the right solutions more easily this
way, at least we would share a common ground on which to argue
about what is right and what is wrong.

John Stuart Mill's ideas, and libertarianism, are ideals I can
understand in this sense. (Even when libertarians run into trouble
over social assistance, I can at least sit down and think about
what other solution might be consistent with these ideals.) Can
someone explain Amartya Sen's axioms to me?

## February 12, 2007

### Permutation City - Greg Egan

I read a sci-fi story called Permutation City by Greg Egan, the
virtuoso of different varieties of "existence". Actually, calling
Egan science fiction does him an injustice; philosophical fiction
would be more accurate. By varieties of existence I mean every
kind of consciousness you can imagine - biological, synthetic,
simulated in software, or evolved inside a simulated universe -
along with their interactions, their copies, and the relative
differences in how they perceive time, all explored to the hilt. A
few topics that might catch your interest: how do people who reach
the age of 1000 cope with the slowdown of their long-term memory;
how do you communicate with a simulation of yourself that runs 17
times slower than you do; what happens to the sense of time of the
consciousnesses inside a simulation when you run it backwards; and
when you pause the simulation, does time stop for them?

### Pearls of ideas

I believe every subject (even the ones I hated in school) has its
pearls of ideas. If our schools fed children these pearls instead
of the mind-numbing logical order, I am sure kids would end up
more curious about many more subjects. Of course, whichever
subject you want to go deep into, you must cross the pearl-strewn
shore and slog through barely passable swamps. But for someone who
has once tasted the pearls, that is a more bearable form of
torture.

In middle school, my friend Ünal told me one day on the bus about
the Pythagorean theorem, which he had learned from his older
brother. My first reaction was "no way". How could mathematicians
know that a^2 + b^2 = c^2 for every right triangle? For one thing,
they would have to draw and measure every triangle, which would
take forever. And besides, how could anyone know that the next
triangle would not break the rule? In short, I did not believe it.
Then, I do not know exactly when, I came to understand that
mathematicians can make definitive claims about infinitely many
objects. Much later still, I understood that this is in fact the
fundamental feature that separates mathematics from every other
pursuit, and that nothing else offers the same certainty.
Why am I telling this story? Because it illustrates beautifully
the properties of the ideas I have been calling "pearls" ever
since. The first reaction a pearl provokes should be "no way" -
the same incredulity I feel when I watch a good magic trick. Then,
some time later, it should make you say "aha" once you
understand. Science and mathematics are full of true and useful
ideas, but few of them first make you say "no way" and then "aha".

Seen from this angle, I find it interesting both that we can give
definitive answers to some questions and that there are questions
which in principle can never be answered. I find it interesting
that consciousness, free will, and the flow of time might each be
an illusion. I find it interesting that electrons go through one
slit or the other depending on whether they are being watched. I
find it interesting that the apples on earth and the stars in
space are made of the same atoms and obey the same laws of
gravity. I find it interesting that we can look at a finite amount
of data and make predictions about things we have never seen - in
short, that "learning" and "science" are possible at all, and what
their limits might be. I find it interesting that even if I never
make money in the stock market in my lifetime, it is impossible to
prove that money cannot be made there; and that in poker, even
though I cannot compute it, there exists a strategy that never
loses money. And that kittens' brains were found to contain
center-surround cells that enable them to see; that we later
understood these cells are in fact taking two-dimensional
derivatives; and that later still it was shown that when random
images are shown to a randomly connected neural network, its cells
can wire themselves up in exactly this way by following a simple
rule - that I find very, very interesting.

And none of this is taught to children in school! I know I could
easily explain everything I have mentioned here to an interested
middle school student. And knowing this, the fact that Bilim ve
Teknik magazine does not answer my messages, that Tübitak does not
ask our opinion when commissioning book translations, that the
ministry of education which will decide what my own child is
taught is run by pitiful people, that the school I work for cares
about neither the quality nor the content of what I teach, and
that time keeps slipping away on trivial busywork - it all drives
me mad.