Help:Match and split



For use with User:ThomasBot. This process takes edited text from the main namespace and applies it to consecutive scanned images in the Page: namespace. Note that each page in the Page: namespace must have a DjVu text layer, and that layer must be of sufficient quality to perform a match.

1) In Special:Preferences, on the Gadget tab, turn on the Thomasbot Match and Split option.

2) Identify the first page in the Page: namespace (from the DjVu file) that corresponds to the text in the main namespace. This requires reading the document in the main namespace and comparing it with the same document in the Page: namespace.

MATCH

Adding MATCH

3) The initial edit is to add __MATCH__ and its associated text to the main namespace page, at the point that corresponds to the first word on the respective Page: namespace page. Either:

the easy way: click the button in the toolbar and paste in the link to the first corresponding page in the Page: namespace; or

from first principles, using the format given here.

Format:   ==__MATCH__:[[Page:La Vie littéraire, II.djvu/82]]==

i.e. wrap in  ==   ==
add  __MATCH__
add  :
add  [[Page:in namespace.djvu/xx]] (it will be a redlink at this point)
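The format above can be assembled mechanically. The following is a minimal sketch in Python; the function name `match_header` is hypothetical and is not part of ThomasBot or the gadget. It only builds the heading line from a DjVu file name and a page number.

```python
def match_header(djvu_name: str, page_number: int) -> str:
    """Build the MATCH heading line for a DjVu file name and page number.

    Hypothetical helper for illustration; it only formats the string
    described in the steps above.
    """
    if page_number < 1:
        raise ValueError("page_number must be a positive integer")
    # wrap in == ==, add __MATCH__, add ":", add the [[Page:...]] link
    return f"==__MATCH__:[[Page:{djvu_name}/{page_number}]]=="


print(match_header("La Vie littéraire, II.djvu", 82))
```

The printed line is what would be pasted into the main namespace page at the point where the matching text begins.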

Note: All formatting below the MATCH will be moved into the Page: namespace, so place categories etc. for the work above the __MATCH__, and remove any markup that straddles the match, e.g. <div class="..."></div>

Hint: Move the overarching formatting for the work, e.g. <div class=… and the like from the end of the work, to above the MATCH statement, and move it back once the SPLIT process is complete.
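Markup that straddles the match can be hard to spot by eye. As a rough sketch, assuming the MATCH line has exactly the format shown earlier and that only `<div>` tags are of interest, the hypothetical function `div_balance` below counts opening and closing `<div>` tags above and below the first MATCH line. A non-zero value on either side suggests a tag that straddles the match.

```python
import re

# Assumed MATCH line format, taken from the example on this page.
MATCH_LINE = re.compile(r"^==__MATCH__:\[\[Page:[^\]]+\]\]==[ \t]*$", re.MULTILINE)


def div_balance(wikitext: str) -> tuple[int, int]:
    """Return (opens - closes) for <div> tags above and below the first MATCH line."""
    m = MATCH_LINE.search(wikitext)
    if m is None:
        raise ValueError("no MATCH line found")

    def balance(text: str) -> int:
        opens = len(re.findall(r"<div\b", text))
        closes = len(re.findall(r"</div\s*>", text))
        return opens - closes

    return balance(wikitext[:m.start()]), balance(wikitext[m.end():])


sample = '<div class="prose">\n==__MATCH__:[[Page:Example.djvu/5]]==\nText\n</div>\n'
print(div_balance(sample))  # (1, -1): a div opens above the match and closes below it
```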

4) If all is working properly, the __MATCH__ should be an active link.

Applying MATCH

5) Click the __MATCH__ link and the job will start.

Outcome from MATCH

6) When finished, the page will

be saved and reloaded,
have its text aligned with the first page in the Page: namespace, and the [[Page:...]] link will now be active (blue),
list all subsequently aligned Page: pages, each inserted as an active link,
show a [split] tab at the top of the page.

STOP!

7) Read the next step carefully, and ensure that the previous process has been both successful and correct before proceeding to split.

Verify MATCH

8) Verify the text has successfully been aligned to each of the Page: pages created.

This may fail if the text in the Page: namespace is of insufficient quality.
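Part of this check can be automated. The sketch below assumes the aligned pages appear in the wikitext as `[[Page:name/number]]` links, as in the MATCH example on this page; the function name `missing_pages` is hypothetical. It reports page numbers that are skipped between consecutive links, which are places worth inspecting by hand.

```python
import re

# Assumed link format: [[Page:<file name>/<page number>]]
PAGE_LINK = re.compile(r"\[\[Page:([^\]]+?)/(\d+)\]\]")


def missing_pages(wikitext: str) -> list[int]:
    """Return page numbers skipped between consecutive Page: links."""
    numbers = [int(n) for _, n in PAGE_LINK.findall(wikitext)]
    missing = []
    for prev, cur in zip(numbers, numbers[1:]):
        # Every number strictly between two neighbouring links was skipped.
        missing.extend(range(prev + 1, cur))
    return missing


sample = (
    "==[[Page:Example.djvu/82]]==\ntext\n"
    "==[[Page:Example.djvu/83]]==\ntext\n"
    "==[[Page:Example.djvu/85]]==\ntext\n"
)
print(missing_pages(sample))  # [84]
```

A gap is not necessarily an error, since blank or image-only pages may be skipped legitimately, but each one should be confirmed against the scan.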

Notes

Errors to look for:
- Quotation marks and similar paired formatting split across a page link or a ===level=== marker
- Incomplete matching, marked by ===no match===. This can happen where successive non-text pages are encountered.

In this case, realign by adding another Match statement further down the page and processing the match again, repeatedly if necessary, before performing the Split.
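On a long page the failure markers are easy to miss. A minimal sketch, assuming the marker appears on its own line exactly as written above (`===no match===`, possibly with spaces inside the equals signs); the function name `find_no_match_lines` is hypothetical:

```python
import re

# Assumed marker format, taken from the note above.
NO_MATCH = re.compile(r"^===\s*no match\s*===\s*$")


def find_no_match_lines(wikitext: str) -> list[int]:
    """Return the 1-based line numbers of every no-match marker."""
    return [
        number
        for number, line in enumerate(wikitext.splitlines(), start=1)
        if NO_MATCH.match(line)
    ]


sample = "==[[Page:Example.djvu/82]]==\ntext\n===no match===\nmore text\n"
print(find_no_match_lines(sample))  # [3]
```

Each reported line is a candidate position for a new Match statement before the Split is attempted.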

SPLIT

Applying SPLIT

9) To get Thomasbot to split the one page into the respective multiple pages, click the [SPLIT] link. The label will change to [splitting]. Please wait and do not click anything while it works; it will take a little while. You can check on progress via the robot activity page.

If the robot activity page shows that the job is complete but the page still shows [splitting], it is okay to click on the originating page at this point.

10) When complete, the original page will reload, now transcluding the text with the <pages /> tag, and all the text will have been loaded into the respective Page: namespace pages.
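For orientation, the transclusion left on the original page typically looks like a single `<pages />` tag naming the index file and a page range. The sketch below assumes the commonly used `index`, `from` and `to` attributes; the exact attributes the bot writes may differ, and the function name `pages_tag` is hypothetical.

```python
def pages_tag(index_file: str, first: int, last: int) -> str:
    """Format a <pages /> transclusion tag for a page range (assumed attributes)."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return f'<pages index="{index_file}" from={first} to={last} />'


print(pages_tag("La Vie littéraire, II.djvu", 82, 90))
```

Comparing the range in the tag on the reloaded page with the range of pages that were matched is a quick sanity check that nothing was dropped.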

Verify SPLIT

11) Verify that each page in the series has been split, and apply Proofread status through the normal editing process.

Notes

Chapters in the main namespace will not always end on a complete Page:... boundary, i.e. the next chapter may start on the same page. The Split process should create a <section> marker at the appropriate place. For the next chapter, insert the new Match statement at the appropriate place on the Page:... and save and process normally.
The bot will create the relevant sections on the Page: and should continue smoothly.
Check that you have restored any formatting, categories, and copyright tags that you placed before the MATCH to avoid them being moved with the page text.

Notes about <pagequality>

When the Split is undertaken, the text will show on the respective pages in Page: namespace as Not Proofread.
