
MediaWiki talk:Dictionary.js

From Wikisource

The following program generates indexes for dictionaries:

#!/usr/bin/python

import re
import wikipedia  # old pywikipedia framework; the other imports were unused

def do(lang,filename,rangefrom,rangeto):

        site = wikipedia.getSite(lang,fam='wikisource')
        wikipedia.setSite(site)
        wikipedia.setAction("dictionary")

        entry = ""
        pgz = []
        out = ""

        for i in range(rangefrom,rangeto):
                pagenum = i
                print i

                page = wikipedia.Page(site, "Page:"+filename+"/%d"%(pagenum) )
                txt = page.get()

                # Match <section begin=NAME/> ... <section end=NAME/> pairs; the quotes around NAME are optional.
                fp = re.compile(r"<section\s*begin=(\"|)(.*?)(\"|)\s*/>(.*?)<section\s*end=(\"|)(.*?)(\"|)\s*/>",re.S)
                for item in fp.finditer(txt):
                        n1 = item.group(2)
                        n3 = item.group(6)
                        #print [n1]
                        if(n1 == n3):
                                if entry and n1 == entry :
                                        pgz.append(pagenum)
                                else:
                                        if entry:
                                                n_from = pgz[0]
                                                if len(pgz)>1 :
                                                        n_to = pgz[-1]
                                                else:
                                                        n_to = n_from

                                                line = "*[[DL#%d:%d|%s]]"%(n_from,n_to,entry)
                                                print line
                                                out = out + line + "\n"

                                        entry = n1
                                        #print "entry:",entry
                                        pgz = [pagenum]
                        else:
                                print "mismatch:", pagenum, n1, n3

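        # Flush the final entry: the loop only emits an entry when the next one starts.
        if entry:
                line = "*[[DL#%d:%d|%s]]"%(pgz[0],pgz[-1],entry)
                print line
                out = out + line + "\n"

        page = wikipedia.Page(site, "User:ThomasBot/dict/"+filename)
        page.put(out)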



if __name__ == "__main__":
        #do('en',"A Dictionary of Music and Musicians vol 1.djvu",13,782)
        do("fr","Diderot - Encyclopedie 1ere edition tome 2.djvu",5,875)
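
For anyone who wants to check the grouping logic without a bot account, here is a minimal Python 3 sketch of the same idea, detached from the wiki framework. It is not the script above, only an illustration under some assumptions: build_index and SECTION_RE are names made up for this example, the pages are passed in as (page number, wikitext) pairs instead of being fetched from the site, mismatched section names are silently skipped where the script prints a warning, and the final entry is emitted once the loop ends.

```python
import re

# Same pattern as in the script: a <section begin=NAME/> ... <section end=NAME/>
# pair, with optional double quotes around NAME.
SECTION_RE = re.compile(
    r'<section\s*begin=("|)(.*?)("|)\s*/>(.*?)<section\s*end=("|)(.*?)("|)\s*/>',
    re.S,
)


def build_index(pages):
    """Return index lines for an iterable of (page_number, wikitext) pairs.

    Hypothetical helper for illustration; consecutive sections with the
    same name are merged into a single "first page:last page" range.
    """
    lines = []
    entry = None          # name of the entry currently being collected
    first = last = None   # first and last page on which that entry appears

    for pagenum, text in pages:
        for match in SECTION_RE.finditer(text):
            begin_name, end_name = match.group(2), match.group(6)
            if begin_name != end_name:
                # The script reports these as "mismatch"; here they are skipped.
                continue
            if begin_name == entry:
                # The entry continues on this page.
                last = pagenum
            else:
                # A new entry starts: write out the previous one, if any.
                if entry:
                    lines.append("*[[DL#%d:%d|%s]]" % (first, last, entry))
                entry = begin_name
                first = last = pagenum

    # Write out the entry that was still open when the pages ran out.
    if entry:
        lines.append("*[[DL#%d:%d|%s]]" % (first, last, entry))
    return lines
```

Feeding it two small pages, where entry B runs over the page break, should give one line per entry, with B covering both pages. The output is the same "*[[DL#from:to|title]]" list format that the script saves to the wiki page.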

Bugs


moved from Wikisource talk:ProofreadPage

The dynamic dictionary has a few problems:

  1. The callback is called when readyState != 4, so the callback terminates with an error.
    Fixed, trivial change.
  2. When a dynamic article contains links to other articles in the same volume, the onClick() handler is not installed and the links don't work.
    Fixed by recording a map of self-links [1].
  3. The way links are created in the left frame is not compatible with Lupin's popups. It's not surprising that the popups don't work on these links, as they are generated dynamically (and I don't plan to fix that), but at least moving the mouse over these links shouldn't generate a JavaScript error.
    First I tried replacing the item.href setting with $(item).click(function (event) { event.preventDefault(); show_dict_entry(index, m_from, m_to, title); }); second, I tried setting a nopopup attribute on each href. Neither worked. I have no idea why the first didn't work; the second failing is expected, as Lupin's popups only check for this attribute after the page loads, not at mouse-hover time. So I know of only one workaround: instead of <div id="dynamic_links" use <div id="dynamic_links" class="nopopups"
  4. The script is too slow; it handles pages containing 3500 dynamic articles poorly.
    Later: perhaps better use of CSS selectors can speed up this script. Example of a slow-to-load page: 3840 dictionary entries.
    Misidentified problem: the slowdown came from another gadget specific to fr:, but I've applied the change anyway [2].

Phe 16:24, 17 September 2011 (UTC)