Disputatio Usoris:Secundus Zephyrus/1000 paginae sizes


Substitutions

As of 23 July 2010 (and undoubtedly some time before), en:Calculus, en:Colon, en:Lever, en:Nabokov, and en:Pulley were deleted and en:Hans Christian Andersen, en:Large intestine, en:Machine, en:Mathematical Analysis and en:Robot were added. IacobusAmor 21:53, 24 Iulii 2010 (UTC)

Don't worry. This script automatically runs with the most up-to-date 1000 list. --SECUNDUS ZEPHYRUS 23:54, 24 Iulii 2010 (UTC)
That's good! In case you haven't noticed, I've been beefing up some of our articles on this list. We should get a noticeable boost in the next computation. ¶ Incidentally, to be safely above a given cutoff, an article should have at least 2500 more characters than it seemingly needs. For example, Lingua Anglica had 12,068 characters during the last computation, but it didn't clear the 10K barrier. That's because interwiki links (and perhaps other things) don't count. IacobusAmor 00:25, 25 Iulii 2010 (UTC)
Yes, hidden text also doesn't count. But SZ's script deals with that, and also with interwiki links. I think you are maybe reading your figure of 12,068 from the history page: those figures are "gross" (in a technical sense), swollen by hidden text and interwiki links. Andrew Dalby (disputatio) 10:22, 25 Iulii 2010 (UTC)
Yes, I was using the "gross count" given by the history page, which, for Lingua Anglica, counts 12 068, as against 12 140.7 counted by SZ's script. My recent changes have raised Lingua Anglica to 12 426, and (unless someone raises it more) we'll see how that turns out in the next computation. IIRC, the keepers of the official count once estimated the interwikis to be worth about 1000 characters, but that estimate now seems low; of course the true number will differ according to the number & identity of the wikis listed, and we'd expect it to be much higher for the 1000 selected paginae than for articles on, say, Garrett Birkhoff and Curtis A. Price. IacobusAmor 11:44, 25 Iulii 2010 (UTC)
I'll run the script again right now, and we can see what it has to say about the changes we've made so far. --SECUNDUS ZEPHYRUS 17:38, 25 Iulii 2010 (UTC)
Also, I think you might be missing the point of this list. My script is supposed to automatically count the interwikis/comments and subtract them from the total characters per page. But, I think there is definitely something wrong with the script, because it doesn't seem to be doing that. I have a hunch why, and I will investigate. --SECUNDUS ZEPHYRUS 19:39, 25 Iulii 2010 (UTC)
Yes, I figured out that your script wasn't correctly subtracting the interwikis etc., but nevertheless it's quite useful for collecting in one place an approximation of the gross counts and permitting them to be put in ascending or descending order. IacobusAmor 20:08, 25 Iulii 2010 (UTC)
I agree with that! Andrew Dalby (disputatio) 21:02, 25 Iulii 2010 (UTC)

Fixed

Okay! I figured out the problem with the script, and I have updated all the numbers. These numbers (and the numbers from here on out) are the lengths of the articles minus the interwikis/comments times the language weight (Latin is 1.1). Notice, for example, that English Language has dropped to 8011.3! I noticed that some articles had a lot of hidden text, such as Katsushika Hokusai. Any questions, let me know! (next on my agenda is to find out how to make those links to the Latin pages...) --SECUNDUS ZEPHYRUS 02:54, 26 Iulii 2010 (UTC)
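The rule described above can be sketched roughly as follows. This is only an illustration, not SZ's actual script: the function name, the two regular expressions, and the rounding are assumptions. Only the rule itself (article length minus interwikis and comments, times a language weight of 1.1 for Latin) comes from the comment.

```python
import re

# Assumed patterns; the real script may detect these differently.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# Rough guess at interwiki links such as [[en:English language]].
INTERWIKI_RE = re.compile(r"\[\[(?:[a-z]{2,3}(?:-[a-z]+)*|simple):[^\]|]+\]\]")

LATIN_WEIGHT = 1.1  # language weight for Latin, as stated in the comment


def weighted_size(wikitext: str, weight: float = LATIN_WEIGHT) -> float:
    """Length of the text without comments and interwikis, times the weight."""
    visible = COMMENT_RE.sub("", wikitext)
    visible = INTERWIKI_RE.sub("", visible)
    # Rounding to one decimal is an assumption, chosen to match figures like 8011.3.
    return round(len(visible.strip()) * weight, 1)


# Made-up sample text: one hidden comment and two interwiki links.
sample = (
    "Lingua Anglica<!-- hidden note -->\n"
    "[[en:English language]]\n"
    "[[simple:English language]]"
)
print(weighted_size(sample))
```

Links to namespaces such as Categoria are left alone by this pattern, because it only matches short lowercase language codes.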

Interesting, and we'll soon find out how well it predicts the results of the official computation (probably scheduled for next weekend). Most of the results I checked seem plausible, and the table reveals a few articles that we can easily & quickly improve. Note that, for Nederlandia, the gross count is 13085, but your count is 9321; dividing 13085 by 1.1 gives us 11895, implying an ignored total of 2574 (because of interwikis & comments), really close to the 2500 that I'd suggested might be typical. Similarly, Civitas Vaticana has a gross count of 11513, which divided by 1.1 = 10466; your count is 8230, implying an ignored total of 2236, again in the same ballpark. Macte, Zephyre! IacobusAmor 03:24, 26 Iulii 2010 (UTC)
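The back-of-the-envelope estimate in the comment above can be reproduced with a few lines. This sketch simply follows the method as stated there (gross count divided by the weight, minus the script's count); the function name is made up for illustration.

```python
def ignored_estimate(gross: float, weighted: float, weight: float = 1.1) -> float:
    """Gross count divided by the language weight, minus the script's count."""
    return gross / weight - weighted


# Figures quoted in the comment: (title, gross count, script's count).
examples = [
    ("Nederlandia", 13085, 9321),
    ("Civitas Vaticana", 11513, 8230),
]

for title, gross, weighted in examples:
    print(title, round(ignored_estimate(gross, weighted)))
```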

Count not up to date?

You say the count was made on 25 July 2010, but the texts it was measuring weren't up to date, as it's showing, for example,

Mongol Empire 0.0
Ming Dynasty 0.0
Tang Dynasty 0.0

but I created each of those articles last week, with approximate gross counts of 12820, 14257, and 22656 respectively. (In the next official count, that'll gain us 0.12 points because each is above 10K.) Also, I created our versions of Hip hop music and Russian Revolution (1917), at 6382 & 5219 words respectively (and therefore in the official count worth 0.02 points together), but your tabulation shows them at zero. Can you make the script use the latest versions of texts? IacobusAmor 11:58, 26 Iulii 2010 (UTC)

I guess there's bound to be some timelag? For the record, the Latin links were added to the respective English pages on 18, 19 and 21 July. Andrew Dalby (disputatio) 12:42, 26 Iulii 2010 (UTC)
I know what the problem is. The script caches every English article on my hard drive to expedite the process (which can take many hours if you are running every language). I didn't realize that it was searching those pages for the interwiki links. I will re-cache and re-run the program. I've also altered it to link directly to the Latin pages! Get excited!!!! --SECUNDUS ZEPHYRUS 22:28, 26 Iulii 2010 (UTC)
I am excited :) Andrew Dalby (disputatio) 08:27, 27 Iulii 2010 (UTC)
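The stale-cache problem described in this thread is commonly avoided by giving cached pages a maximum age. The following is a minimal sketch of that idea, not a description of how SZ's script works: the function name, the file naming, the one-day default, and the `fetch` callable (whatever actually downloads a page) are all hypothetical.

```python
import os
import time
from typing import Callable


def cached_fetch(
    title: str,
    fetch: Callable[[str], str],
    cache_dir: str,
    max_age_seconds: float = 24 * 3600,
) -> str:
    """Return page text, downloading again when the cached copy is too old."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hypothetical file naming: replace path separators in the title.
    path = os.path.join(cache_dir, title.replace("/", "_") + ".txt")

    if os.path.exists(path):
        age = time.time() - os.path.getmtime(path)
        if age <= max_age_seconds:
            # Cached copy is fresh enough; skip the download.
            with open(path, encoding="utf-8") as handle:
                return handle.read()

    # No cached copy, or it is stale: fetch and overwrite the cache file.
    text = fetch(title)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

With a scheme like this, newly added interwiki links on the English pages would be picked up on the first run after the cached copy expires, at the cost of downloading the page again.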

The quality of the list of 1000 topics

This list is deeply flawed, in at least three ways: (1) it favors European culture over that of other continents; (2) it favors the present over the past; and (3) some of its terms, even within the limits set by (1) and (2), are strange. We necessarily stand amazed at the inclusion of Frederic Chopin & Salvador Dali & Steven Spielberg and a host of similarly particular biographies. (Even in a list of the world's 1000 most consequential people, not all these worthies, excellent though their contributions to culture might be, may rightly qualify.) If the point is to find the world's 1000 most important topics, I once suggested the abolition of biographies altogether, but the compilers of the list demurred. If someone wants to develop an alternate list, I'd be willing to help. IacobusAmor 12:37, 26 Iulii 2010 (UTC)

Someone is working on an extended list. --SECUNDUS ZEPHYRUS 22:19, 26 Iulii 2010 (UTC)
Actually, I remember one person making a very interesting statement on a disputatio about the 1000 list. Someone wrote that a particular article was not very well-known, and he replied, saying that the point of the list is to find topics that are less known in order to broaden the whole world's horizon. I like the idea of broadening others' horizons (including my own), but I don't think that's the point of the list. I think that specific criteria should be laid out as to what kinds of articles really should be there. --SECUNDUS ZEPHYRUS 22:24, 26 Iulii 2010 (UTC)
I agree, that can never be the true aim of such a list. To broaden horizons, we need to work on the topics that aren't in the "canon": as soon as any topic gets into the "canon" we would need to move on to another one. Broadening horizons is indeed exactly what Wikipedias do -- but a list of 1000 chosen topics hardly helps with this. Andrew Dalby (disputatio) 08:25, 27 Iulii 2010 (UTC)
On that point, we're all in agreement. To "broaden the world's horizons," hire an advertising agency! ¶ My view of the enterprise is that each of the 1000 articles should be the cognitional hub to which the most numerous other articles link. Reckoning Brahms, Chopin, Dvořák, and Wagner among the 1000 is therefore a less effective plan than having an article on, say, "The 'Romantic' Period in Music, 1828–1913," which would be the hub from which links to their individual articles would radiate. A reasonable next step would be to abolish all other musicians' biographies and institute analogous articles, such as "The Classical Period in Music, 1750–1828," "The Baroque Period in Music, 1600–1750," "The Renaissance in Music, 1400–1600," and so on (or maybe even just "The Renaissance" would be the true hub, eliminating the need for "The Renaissance in Visual Art," "The Renaissance in Literature," "The Renaissance in Science," and such). Such articles would necessarily involve numerous composers, performers, critics, means (instruments, orchestras), methods (stylistic traits), genres, and so on. Putting Chopin on a par with true hubs like Astronomia and Planta is rather like comparing a tree (and a slender one at that!) with a forest. ¶ Another approach, if an appropriate script could be written, would be empirical: to find out which 1000 articles are already the most linked-to in the big wikipedias and use them as the 1000 paginae; or, if a little human tinkering were desirable, to find, say, the top 1500 and then debate which of them should be cut. IacobusAmor 11:58, 27 Iulii 2010 (UTC)
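The empirical approach suggested above could be sketched as an inbound-link count. This is a toy illustration under stated assumptions: the function name and the sample link data are made up, and a real run would need the link tables of the big wikipedias rather than a small dictionary.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def most_linked(outlinks: Dict[str, Iterable[str]], n: int) -> List[Tuple[str, int]]:
    """Return the n pages with the most distinct pages linking to them."""
    inbound: Counter = Counter()
    for source, targets in outlinks.items():
        for target in set(targets):  # count each linking page only once
            if target != source:     # ignore self-links
                inbound[target] += 1
    return inbound.most_common(n)


# Made-up sample data: page title -> titles it links to.
sample_links = {
    "Chopin": ["Romantic music"],
    "Brahms": ["Romantic music"],
    "Wagner": ["Romantic music", "Brahms"],
    "Romantic music": [],
}
print(most_linked(sample_links, 2))
```

Raising `n` to 1500 would give the longer candidate list from which entries could then be cut by discussion.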

order from least to greatest

Hey, would it be possible in future dumps to organize results in order from smallest to largest? Is that a good idea? What do you think? Best, --Ioscius 12:32, 12 Augusti 2010 (UTC)

We can do this now, by clicking on the little grey arrows next to "Weighted size". One click brings the smallest pages to the top; a second click brings the largest pages to the top. Andrew Dalby (disputatio) 12:36, 12 Augusti 2010 (UTC)
Nifty!--Ioscius 12:44, 12 Augusti 2010 (UTC)

Res Novae Russiae (1917)

Where is this article in the list? We don't see it. IacobusAmor 13:29, 13 Augusti 2010 (UTC)

Come on, that's no fault in the script! The link to :la was only added to the English page on 10 August (by a certain J. W. Love). See [1]. So it'll appear in the list, no doubt, when SZ next runs the script. Andrew Dalby (disputatio) 13:58, 13 Augusti 2010 (UTC)
Oh, OK. One can't remember everything, especially after a day of proofreading a magazine. Is the running of the script going to be a regular occurrence, say, on Saturday evenings? Several of us are using it to make progress on articles that are far more important than many. (Please enjoy how indirectly & delicately phrased that is.) IacobusAmor 14:02, 13 Augusti 2010 (UTC)
That is indeed the most maddening of occupations (proof-reading, I mean). I truly sympathize :) Andrew Dalby (disputatio) 14:07, 13 Augusti 2010 (UTC)

"there must be some kind of lag because it says 3 are missing"

Yes, we've noted a timelag before, but is there a way of making the program name the missing ones? IacobusAmor 17:35, 27 Augusti 2010 (UTC)

I'm not a good enough programmer to do that. Right now it's a choice between: (a) listing them in Latin but with blanks for the missing articles or (b) all English titles with links to en.wiki, without blanks. Good news is, I don't think there will be any missing articles in a few days!
I will, however, see if I can figure out a way of doing it for the future, when they will inevitably change the list... --SECUNDUS ZEPHYRUS 18:52, 27 Augusti 2010 (UTC)
If one or two are truly missing, we'll find out in a few days, when the script is run again over at Meta, because that program lists any missing ones. According to that list, we had 28 missing last month, so we've made good progress this month, though quite a few of our stipulas are more like stipuliculas. ;) IacobusAmor 19:50, 27 Augusti 2010 (UTC)
I have in any case added Media communicationis socialis since SZ's script was run, so that accounts for one. Andrew Dalby (disputatio) 20:30, 27 Augusti 2010 (UTC)
If SZ could run it again on Sunday or so, we'd have a day to tie up any loose ends before the last day of the month and the next calculation at Meta. IacobusAmor 03:14, 28 Augusti 2010 (UTC)
Sorry, I won't have another chance to run it again before the start of the month, so I ran it just now. --SECUNDUS ZEPHYRUS 16:00, 28 Augusti 2010 (UTC)
OK, thanks! IacobusAmor 16:20, 28 Augusti 2010 (UTC)
Macte! That may have been the last one. As of the start of the month, only two other wikis (en & simple) had articles for all 1000 items on the list, and only seven other wikis had articles for 999. IacobusAmor 03:13, 28 Augusti 2010 (UTC)
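Naming the missing articles, as asked at the top of this thread, could in principle be done by keeping the English titles alongside whatever Latin link was found for each. The sketch below assumes the script already holds such a mapping; the function name and the sample data are made up for illustration and do not reflect the actual state of the list.

```python
from typing import Dict, List, Optional


def missing_titles(latin_for: Dict[str, Optional[str]]) -> List[str]:
    """English titles on the list for which no Latin page was found."""
    return sorted(en for en, la in latin_for.items() if not la)


# Made-up sample: English title -> Latin title, or None when no link was found.
sample = {
    "English language": "Lingua Anglica",
    "Russian Revolution (1917)": None,
    "Hip hop music": None,
}
for title in missing_titles(sample):
    print(title)
```

This would allow the table to stay in Latin while a separate short list names the gaps in English.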

Dubai

The people who manage the main list have deleted en:United Arab Emirates and replaced it with en:Dubai. Would someone like to create that article? If not, Vicipaedia's coverage will fall to 999, instead of 1000. IacobusAmor 01:41, 21 Septembris 2010 (UTC)

Salve!

Welcome back, and thanks for updating the list. It's quite useful! IacobusAmor 10:49, 26 Maii 2011 (UTC)

Sure thing! I've been checking up on things every so often, but I got bogged down with a lot of school work, so I didn't have time to actually contribute anything. Great to be back! --SECUNDUS ZEPHYRUS 00:18, 27 Maii 2011 (UTC)