library-it - Tumblr blog

library-it · 2 years

Usage of Fedora Repository

The indexing process is running now, and I am adding new collections to it. The migration from Fedora 4 to Fedora 6 failed, which is why I am re-adding the whole list of collections from the beginning.

I use Fedora Repository as a long-term archive, but not only for that.

Fedora is also the storage for Madoc (iiif.crossasia.org):

- Fedora -> storage

- IIIF Image server -> transformation of images

- Madoc -> display and annotate manifests
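To illustrate the "transformation of images" role of the IIIF Image server, here is a minimal sketch of how a IIIF Image API request URL is put together. The base URL and the identifier in the usage example are placeholders, not the real CrossAsia endpoints.

```shell
#!/usr/bin/env bash
# Build a IIIF Image API URL of the form
#   {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
# Defaults ask for the full image, unrotated, as JPEG.
# Note: the size keyword "max" is valid in Image API 2.1 and 3.0.
iiif_image_url() {
  local base="$1" id="$2"
  local region="${3:-full}" size="${4:-max}" rotation="${5:-0}"
  local quality="${6:-default}" format="${7:-jpg}"
  printf '%s/%s/%s/%s/%s/%s.%s\n' \
    "$base" "$id" "$region" "$size" "$rotation" "$quality" "$format"
}
```

For example, `iiif_image_url https://iiif.example.org/iiif/3 page-0001 full 800,` asks the image server for the whole page scaled to a width of 800 pixels.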

#madoc #fedora repository #iiif

library-it · 2 years

Fedora Repository

I use Fedora Repository to store different kinds of data. I add all my sources to this tool.

Normally I use the following structure:

collection -> book -> page -> image

In an extra folder I save data that is useful for creating a collection.
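As a rough sketch, the collection -> book -> page -> image structure maps onto nested Fedora resource paths. The base URL and the resource names below are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical base URL; replace it with your own Fedora REST endpoint.
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"

# Join the hierarchy levels into one resource URL.
# Usage: fedora_resource_url <collection> [book] [page] [image]
fedora_resource_url() {
  local url="$FEDORA_BASE" part
  for part in "$@"; do
    url="$url/$part"
  done
  printf '%s\n' "$url"
}
```

Calling `fedora_resource_url my-collection book-001 page-0001 image` prints the URL of one image resource four levels below the REST root.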

This article lists the typical commands that I use with Fedora: https://library-it.tumblr.com/post/662968560931504128/commands-for-fedora-repository

I use Tomcat (v9) as the application server. The setenv.sh file in tomcat/bin holds all the settings for the connection between Fedora and Tomcat.
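The actual settings are not shown in this post, so the following is only a sketch of what such a setenv.sh might contain. The data path and the memory sizes are assumptions; fcrepo.home is the Fedora system property that points to its home directory.

```shell
#!/bin/sh
# tomcat/bin/setenv.sh: read by catalina.sh when Tomcat starts.
# The path and heap sizes below are assumptions for illustration.
FCREPO_HOME="${FCREPO_HOME:-/opt/fedora/data}"

# fcrepo.home tells Fedora where to keep its data.
CATALINA_OPTS="${CATALINA_OPTS:-} -Dfcrepo.home=$FCREPO_HOME"
# Heap settings; tune them to the size of the repository.
CATALINA_OPTS="$CATALINA_OPTS -Xms1g -Xmx2g"
export CATALINA_OPTS
```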

library-it · 2 years

Install Madoc on Linux server

1. Install Docker and Docker Compose

https://library-it.tumblr.com/post/666025810043518976/commands-for-docker

https://docs.docker.com/compose/install/

2. Clone Madoc to your folder: git clone https://github.com/digirati-co-uk/madoc-platform.git

3. Copy the .env file into your project folder. This file contains the Postgres database credentials and MADOC_INSTALLATION_CODE.

4. Create a database in Postgres and 5 schemas.

5. Create roles:

CREATE ROLE madoc_ts WITH LOGIN ENCRYPTED PASSWORD 'madoc_ts';

CREATE ROLE tasks_api WITH LOGIN ENCRYPTED PASSWORD 'tasks_api';

CREATE ROLE models_api WITH LOGIN ENCRYPTED PASSWORD 'models_api';

CREATE ROLE config_service WITH LOGIN ENCRYPTED PASSWORD 'config_service';

CREATE ROLE search_api WITH LOGIN ENCRYPTED PASSWORD 'search_api';

6. Create schemas:

CREATE SCHEMA config_service AUTHORIZATION config_service;

CREATE SCHEMA madoc_ts AUTHORIZATION madoc_ts;

CREATE SCHEMA models_api AUTHORIZATION models_api;

CREATE SCHEMA search_api AUTHORIZATION search_api;

CREATE SCHEMA tasks_api AUTHORIZATION tasks_api;

7. Delete the schemas, if necessary:

DROP SCHEMA if exists config_service,madoc_ts,models_api,search_api,tasks_api CASCADE;

8. Add extension:

CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

9. The extension should be assigned to the models_api schema.

10. Pull project: docker-compose -f docker-compose.yml pull

11. Start project: docker-compose -f docker-compose.yml up -d

12. Save the logs to a file: docker-compose -f docker-compose.yml logs > log.txt

13. Stop server: docker-compose -f docker-compose.yml stop

14. Remove containers: docker-compose -f docker-compose.yml rm

15. First start: enter your server name in the browser without a port (https://lx0015.sbb.spk-berlin.de/). If everything is correct, you'll see the Madoc page asking for a code. You should add the Madoc installation code to the .env file. On the web page, just type madoc and click next. On the next page, fill in the form: create an admin user and password and a name for the start page. After that you can start working with Madoc.

16. Add crossasia theme:

docker ps - see all containers

docker cp crossasia-theme/ 4c64d5f9dfd4:/home/node/app/themes - copy theme to container

stop and start application
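The database steps 5, 6, 8 and 9 above can be generated with a small loop instead of typing each statement. This is a sketch, not the official Madoc setup: it reuses the role names as passwords exactly as in the list above, and it assumes that "assigned to the models_api schema" means installing the extension into that schema.

```shell
#!/usr/bin/env bash
# Print the SQL for the five Madoc services: one role and one schema each.
# The passwords repeat the role names, as in the steps above;
# use real secrets on a production server.
madoc_setup_sql() {
  local svc
  for svc in madoc_ts tasks_api models_api config_service search_api; do
    printf "CREATE ROLE %s WITH LOGIN ENCRYPTED PASSWORD '%s';\n" "$svc" "$svc"
    printf 'CREATE SCHEMA %s AUTHORIZATION %s;\n' "$svc" "$svc"
  done
  # Assumption: the extension is installed into the models_api schema.
  printf 'CREATE EXTENSION IF NOT EXISTS "uuid-ossp" SCHEMA models_api;\n'
}
```

The output can be piped into psql, for example `madoc_setup_sql | psql -U postgres -d madoc`, where the user postgres and the database name madoc are placeholders for your own values.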

library-it · 3 years

Linux commands

Hi. Here I publish a list of different commands that help me in my daily work:

- Commands for Solr

- Commands for Linux

- Commands for Windows

- Commands for Docker

- Commands for Fedora repository

- Commands for Regular expressions

#linux #solr #docker #Postgesql

library-it · 3 years

Commands for Docker

Here is a small list of Docker commands that I use in my project:

Install Docker on a Linux server:

https://docs.docker.com/install/linux/docker-ce/debian/

Install Docker Compose:

https://docs.docker.com/compose/install/

Docker commands:

systemctl restart docker

service docker restart

docker exec -i -t madoc-sbb-standalone-omeka /bin/bash -> enter a Docker container by name

docker exec -i -t 2630cc83227c /bin/bash -> enter a Docker container by ID

docker ps -> list the running containers

docker-compose -f docker-compose.server.yaml up -d -> start

docker-compose -f docker-compose.server.yaml pull -> update

docker-compose -f docker-compose.server.yaml stop -> stop

docker-compose -f docker-compose.server.yaml rm -> remove

#docker #docker-compose

library-it · 3 years

Add Solr Core without restarting the whole Solr instance

Yes, it's possible. Here are the steps:

1. Create a new core:

Run the command from solr/bin:

solr create_core -c ajax-skqs -p 8995 -force

2. Change the core properties, delete unused files.

3. Reload (restart) that specific Solr core from the Solr admin UI.

4. Add your JSON files to the Solr core:

curl http://b-app66:8985/solr/ajax-dfz/update -H "Content-Type: application/json" --data-binary @books.json

5. Reload (restart) that specific Solr core from the Solr admin UI again.
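Steps 3 and 5 can also be done from the command line with the RELOAD action of Solr's CoreAdmin API, which reloads one core without touching the rest of the instance. A minimal sketch; the function names are made up for this example.

```shell
#!/usr/bin/env bash
# Build the CoreAdmin RELOAD URL for one core.
# Usage: solr_reload_url <base-url> <core>
solr_reload_url() {
  local base="$1" core="$2"
  printf '%s/solr/admin/cores?action=RELOAD&core=%s\n' "$base" "$core"
}

# Reload the core; -f makes curl return an error on HTTP failures.
solr_reload_core() {
  curl -fsS "$(solr_reload_url "$1" "$2")"
}
```

With the host and core from step 1 this would be `solr_reload_core http://b-app66:8995 ajax-skqs`.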

#solr

library-it · 3 years

10 posts!

#10 posts #tumblr milestone

library-it · 3 years

Commands for Regular expressions

1. Remove empty lines.

^[ \t]*$\r?\n

2. Delete all odd lines in Notepad++

Ctrl+H

Find what: .+\R(.+)

Replace with: $1

Replace all

3. Delete tags (the text between < and >)

<[^>]*>

4. Put quotation marks around a date-like number

Find what: \d\d\d\d.\d\d.\d\d

Replace with: "$&"
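The same four operations can be run on the Linux command line. This is a sketch of rough equivalents with grep, awk and sed, not an exact port of the Notepad++ expressions; the function names are made up for this example.

```shell
#!/usr/bin/env bash
# 1. Remove lines that are empty or contain only white space.
remove_empty_lines() { grep -v '^[[:space:]]*$'; }

# 2. Delete the odd lines (1st, 3rd, ...) and keep the even ones.
delete_odd_lines() { awk 'NR % 2 == 0'; }

# 3. Delete tags, i.e. everything between < and >.
strip_tags() { sed -E 's/<[^>]*>//g'; }

# 4. Put quotation marks around date-like numbers such as 2021.09.20.
#    As in the original pattern, the separator may be any character.
quote_dates() { sed -E 's/[0-9]{4}.[0-9]{2}.[0-9]{2}/"&"/g'; }
```

Each function reads standard input, so they can be chained, for example `strip_tags < page.html | remove_empty_lines`, where page.html is a placeholder file name.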

#regular expressions

library-it · 3 years

Commands for Fedora repository

1. Add a *.tgz archive to the Fedora repository

curl -u fedoraAdmin:fedoraAdmin -i -X PUT --data-binary @/mnt/fedora/raw/ajax-fo-japan/books_ajax_fo_japan.tgz -H "Content-Type: application/x-gtar" -H "Content-Disposition: attachment; filename=books_ajax_fo_japan.tgz" http://b-lx0005.sbb.spk-berlin.de:8082/fcrepo/rest/ajax-fo-china-japan/documentation/archive/books

2. Add a *.zip archive to the Fedora repository

curl -u fedoraAdmin:fedoraAdmin -i -X PUT --data-binary @/mnt/fedora/raw/ajax-fo-japan/pages_ajax_fo_japan.zip -H "Content-Type: application/zip" -H "Content-Disposition: attachment; filename=pages_ajax_fo_japan.zip" http://b-lx0005.sbb.spk-berlin.de:8082/fcrepo/rest/ajax-fo-china-japan/documentation/archive/pages

3. Add a text file

curl -u fedoraAdmin:fedoraAdmin -i -X PUT --data-binary @/mnt/fedora/raw/ajax-fo-japan/README.md -H "Content-Type: text/html" -H "Content-Disposition: attachment; filename=README.md" http://b-lx0005.sbb.spk-berlin.de:8082/fcrepo/rest/ajax-fo-china-japan/documentation/info/readme

4. Add JSON-LD file

curl -u fedoraAdmin:fedoraAdmin -i -X PUT -H "Content-Type: application/ld+json" --data-binary @README.json http://b-lx0005.sbb.spk-berlin.de:8082/fcrepo/rest/documentation

5. Delete a resource, then its tombstone

curl -X DELETE "http://b-lx0005.sbb.spk-berlin.de:8080/fcrepo/rest/dllm/collection"

curl -X DELETE "http://b-lx0005.sbb.spk-berlin.de:8080/fcrepo/rest/dllm/collection/fcr:tombstone"

6. Start Fedora with Jetty

java -jar fcrepo-webapp-6.0.0-jetty-console.jar --port 8088
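The upload commands 1 and 2 differ only in the file, the Content-Type and the target URL, so they can be wrapped in one helper. A sketch under assumptions: the function names and the FEDORA_USER variable are invented for this example, and the fallback type is a generic choice that does not come from the list above.

```shell
#!/usr/bin/env bash
# Pick the Content-Type by file extension, using the types from the
# commands above.
fedora_content_type() {
  case "$1" in
    *.tgz) echo "application/x-gtar" ;;
    *.zip) echo "application/zip" ;;
    *)     echo "application/octet-stream" ;;  # generic fallback (assumption)
  esac
}

# Upload one file as a binary resource.
# Usage: fedora_put <file> <target-url>
fedora_put() {
  local file="$1" url="$2"
  curl -u "${FEDORA_USER:-fedoraAdmin:fedoraAdmin}" -i -X PUT \
    --data-binary "@$file" \
    -H "Content-Type: $(fedora_content_type "$file")" \
    -H "Content-Disposition: attachment; filename=$(basename "$file")" \
    "$url"
}
```

Command 1 then becomes `fedora_put /mnt/fedora/raw/ajax-fo-japan/books_ajax_fo_japan.tgz http://b-lx0005.sbb.spk-berlin.de:8082/fcrepo/rest/ajax-fo-china-japan/documentation/archive/books`.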

#fedora repository

library-it · 3 years

Commands for Windows

1. Write the list of files in a folder to a file

dir > print.txt

2. Rename files in Windows PowerShell: drop everything after a specific symbol (here "_") and add .pdf

Get-ChildItem |Rename-Item -NewName { $($_.Name -split '_')[0]+".pdf" }

3. Copy files from multiple folders into one

for /r %d in (*.xml) do copy "%d" "F:\temp"

4. Remove the same part of a file name for many files in Windows

ren "SZFZ*.xml" "////*.xml"

find . -type d -not -empty -exec echo mv SZFZ ;

library-it · 3 years

Why do I use Apache Solr?

Solr is good at searching text data, especially if you have a lot of it. In the library, of course, I have plenty.

I have a list of Solr collections that I display with the ajax-solr library.

I modified it a little; see the code.

You can see how it looks finally on our website.

#apache solr #solr

library-it · 3 years

Commands for Linux

Hi. Here is my list of Linux commands that help me. Of course it doesn't cover every command, but I'll try to keep improving it.

1. Delete \r (carriage returns) from a shell script

tr -d '\r' < file.sh > fileR.sh

2. PDF to text, split per page

for f in OB1*.pdf; do for i in {1..999}; do pdftotext -f "$i" -l "$i" -layout "$f" "${f%.pdf}_$i.txt"; done; done

PDF to text, without splitting

for file in *.pdf; do pdftotext -layout "$file"; done

3. Show the Tomcat process

ps -ef | grep tomcat

4. Empty the Tomcat catalina.out file

echo > catalina.out

5. Move all files from subfolders into the current folder

find . -type f -print0 | xargs -0 -I file mv --backup=numbered file .

6. Add a sleep to a shell script after every n lines (search and replace with regular expressions enabled)

Find what: (.*\r?\n){1000}\K

Replace with: sleep 30\n

7. Change the file mode (permissions)

chmod 775 articles_2007_2009R.sh

8. Change file owner

chown andrey book.json

9. Replace white space in file names with underscores

for f in *; do mv "$f" "$(echo "$f" | tr ' ' '_')"; done

10. Show all processes

htop

11. Replace white space in file names with underscores (same as 9)

for file in *; do mv "$file" "$(echo "$file" | tr ' ' '_')"; done

12. Split a file into multiple files

split -l 1000000 amd.json amd

grep -hnr "No OCFL mapping found for" nohup.out > output.txt

split -l 20000 pages.json amd

13. Count the files in each folder, to compare folders

find -type d -readable -exec sh -c 'printf "%s " "$1"; ls -1UA "$1" | wc -l' sh {} ';' > file.txt

find . -maxdepth 1 -mindepth 1 -type d -exec sh -c 'echo "{} : $(find "{}" -type f | wc -l)" file\(s\)' \; > file.txt

14. Copy to remote server

scp -r *.* [email protected]:/srv/solr-crossasia-itr/data/ajax-minguo/data/index/

15. Number of Files

ls | wc -l

16. Display the first 4 files in a folder

ls -U | head -4

17. Find images larger than a specific size (here wider than 800 and higher than 600 pixels)

find -type f -regex "^.*\.\(png\|jpg\|jpeg\)$" -exec identify -format "%f, %w, %h\n" {} \; | awk -F ',' '$2 > 800 && $3 > 600'

18. Start a script with nohup

nohup ./file.sh &

19. Kill a nohup process

ps -ef |grep nohup

pkill -9 -P <parent pid>

kill -9 -PID

20. Find files with 0 bytes

find -size 0 -print

21. Remove duplicate lines

sort -u big-csv-file.csv > duplicates-removed.csv

22. Unpack a tar archive

tar vxf archiv.tar

23. Create a tar archive

tar cvf archiv.tar archivordner

24. Unpack a tgz archive

tar -xzvf archiv.tgz

25. Create a tgz archive

tar cvfz archiv.tgz archivordner

26. Unpack a tar.gz archive

tar xfvz [ARCHIV].tar.gz

gzip -d archiv.tar.gz

27. Compress a tar archive to tar.gz

gzip -9 archiv.tar

28. Convert images from *.tif to *.jpg

convert 00000010.tif -quality 80 -resize 30% test.jpg

convert *.tif -quality 80 -resize 30% -set filename:base "%[basename]" "%[filename:base].jpg"

29. Split a CSV file and repeat the header line in every part

lines=1600000; { read header && sed "1~$((${lines}-1)) s/^/${header}\n/g" | split -l $lines --numeric-suffixes=1 --additional-suffix=.txt - pages_ ; } < pages.csv

30. Find a string in a file

find nohup.out -type f -print | xargs grep "HTTP/1.1 5"
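The one-liners in 9 and 11 try to move every file, including those without spaces, and they can overwrite an existing file. Here is a slightly more careful sketch of the same idea; the function name is made up for this example.

```shell
#!/usr/bin/env bash
# Replace spaces with underscores in the names of all files in a folder.
# Names without spaces are skipped and existing files are never overwritten.
# Usage: underscore_names [dir]   (default: current folder)
underscore_names() {
  local dir="${1:-.}" f base new
  for f in "$dir"/*; do
    [ -e "$f" ] || continue               # empty folder: nothing to do
    base=$(basename "$f")
    new=$(printf '%s' "$base" | tr ' ' '_')
    [ "$base" = "$new" ] && continue      # no spaces in this name
    if [ -e "$dir/$new" ]; then
      echo "skip: $dir/$new already exists" >&2
      continue
    fi
    mv -- "$f" "$dir/$new"
  done
}
```

Run `underscore_names .` in the folder whose file names should be cleaned up.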

#linux

library-it · 3 years

Commands for Solr

Hi. Here is a list of Solr commands that I typically use:

1. Create a new core:

solr create_core -c ajax-skqs -p 8995 -force

2. Rename a Solr core:

http://b-app66:8995/solr/admin/cores?action=RENAME&core=ajax-diaolong-yldd&other=ajax-dl-yldd

3. Upload a JSON file into a Solr core:

curl http://b-app66:8985/solr/ajax-dfz/update -H "Content-Type: application/json" --data-binary @books.json

4. Export data from Solr:

Here I use the following library:

./run.sh -s http://b-app66:8985/solr/rep-diaolong-shiliao -a export --skipFields system_create_dtsi,system_modified_dtsi -o /data1/solr/diaolong-shiliao/pages.json

#solr #apache solr

library-it · 3 years

Solr tool for fulltext search

Hi. As you know, Solr is a good tool for storing and searching text data. Compared with a relational database, I see a long list of advantages.

It's really fast!!!

In this post I show some tools that I use with Apache Solr.

To export data from Solr into a JSON file I use an import/export tool.

It's very easy, and you don't have to export all fields, only the ones you really need.

To import data into a Solr collection I use a standard Linux command (curl).

#apache solr #fulltext search

library-it · 3 years

Applications Stack

Hi. In this post I'll talk about the data pipeline in the library. We have raw data in different formats: databases, txt files, PDFs and other text formats. In the next posts I'll describe each step of the pipeline in detail. The first step is transforming the text into JSON and saving it in Apache Solr.

1. I transform the data to JSON so that I can save it in Apache Solr -> https://solr.apache.org/.

With the JavaScript library ajax-solr I have a web view.

Here is a link showing how it looks on our website: link

2. Fedora repository -> https://duraspace.org/fedora/

I save all data in the Fedora repository: not only text, but also the original formats, images, video and so on. For this I convert the already existing JSON files into JSON-LD files and ingest them into Fedora.

3. I save the IIIF data there (in the Fedora repository) too. I create collections and manifests and, together with the IIIF Image Server, display them with Madoc.

These are the main steps in my stack.

#solr #apache solr #fedora repository #iiif #digirati #madoc #json-ld

library-it · 3 years

My Stack tools

Here I'll post a list of applications that help me with my work. If you have any ideas, please suggest them. You'll see a mix of tools for Windows and Linux.

1. IntelliJ IDEA -> https://www.jetbrains.com/de-de/idea/, the best tool for Java programming. I use it for JavaScript too.

2. Notepad++ -> https://notepad-plus-plus.org/downloads/, it's good, but not for all situations. If you'll work with files bigger than 200 MB, think about something else.

3. EmEditor -> https://emeditor.com/, ideal for huge files

4. PuTTY -> https://www.putty.org/, SSH client

5. JSON Editor Online -> https://jsoneditoronline.org/, the best online JSON editor

6. Regular expressions -> https://regex101.com/, online tool

7. Postgres viewer for Linux -> pgadmin.org

8. Xming Server -> http://www.straightrunning.com/XmingNotes/

My IntelliJ IDEA runs on a Linux server, and with Xming Server I connect to it from my Windows machine.

9. WinSCP -> https://winscp.net/eng/download.php

10. Filezilla -> https://filezilla-project.org/

11. Chrome extension for JSON -> https://chrome.google.com/webstore/detail/json-formatter/bcjindcccaagfpapjjmafapmmgkkhgoa/related

#idea intellij #emeditor

library-it · 3 years

Hi. First Post

Hi, my name is Andrey and I'm from Berlin. I work at the Berlin State Library as a computer scientist. In this blog I'll post about my technology stack, my tasks and how I solve them.
