Urban75 Home About Offline BrixtonBuzz Contact

website database organisation help, please.

TremulousTetra

prismatic universe
I wonder if you can help me. This is my website http://www.resistancemp3.lpi.org.uk/ . As you can see it is a catalogue of political speeches/talks/lectures. The website started off with about 10 files, but has now grown to 350 plus, and with this growth it is now getting somewhat unmanageable. So I wonder if people could advise me.

Some people have suggested that it may be better for me to move towards a database driven web site. I have no idea how I would do this, (so could people point me in the right direction as to what I need to learn to do this) but I do have an idea what I would want it to be able to achieve.

I would like to be able to create a database of my files which had a column for the titles of the files as they appear on the website now, the name of the files as they appear on my hard disk, the link to the files as they exist on the Internet, and columns for the various categories/web page which when checked the file would appear on an appropriate page on my website.

Rather than people doing a search, like the search engine I have now on my website, and pulling individual files, I would prefer it if any such search pulled the whole page/category of files upon which the file searched for resides. It would be handy if when searched for it did pull the right spot on the web page where the file is listed.

I also have another problem in this area which requires a different set of skills to circumvent. I suppose basically I can sum it up by saying, I am not a librarian, I do not know how to categorise things properly. Using the database system should make it easier to be able to put the same file in several different categories (which is a major shortfall of the current system of categorisation), but I am still left wondering is there a better way of organising and categorising these speeches. Does anybody have any ideas?

ResistanceMP3
 
Mmmm, relational database schemas :)

Right, this is (the first draft of) how I would do it. Fire up your copy of Access and make a new table. It needs fields for:

ID (the primary key, autonumber, always unique. Everything everywhere has to have a unique ID).
File name (not the path, just the name - and I'm assuming that files have the same name on your local hard drive as they do on your site)
File title (what you want it to be called in the link or whatever)

So that's your "Files" table.

You now need a "Pages", or "Categories" table. Lets keep it simple: you need a table with fields for:
ID (always, always, the ID)
Category name (Some text. A description. Whatever.)

Right. Table one, table two. They need to be related. This is how:

Table 3: "Files-Categories"
This table tracks the many-to-many relationship between files and categories. One file can be in many categories, and one category can contain many files:

ID (primary key, unique number, the usual)
File_ID (the ID of the file in question)
Category_ID (the ID of the catagory you want to tie it to)

There. By adding rows to this table, you can tie as many files to as many categories as you want. And by expanding on exactly the same principle, you can create relationships accross sub-genres, authors, dates and whatever else you like.
 
when I went to save the table Microsoft access asked me "Although the primary key isn't required, it is highly recommended. The table must have a primary key for you to define a relationship between this table and all the tables in the database. Do you want to create a primary key."

should I have clicked yes with this? with you creating the third table doesn't this do that job? And I should have told it no?

ResistanceMP3.
 
and should I have two more tables for author and year.

also my files are on the the Internet on three different servers, so at different link addresses, does this make a difference?
 
ResistanceMP3 said:
when I went to save the table Microsoft access asked me "Although the primary key isn't required, it is highly recommended. The table must have a primary key for you to define a relationship between this table and all the tables in the database. Do you want to create a primary key."

Why are you using Access...? Mysql + Php will do the job quite ok...
 
You don't need a separate table for authors if it's just a case of "one file, one author". In that case, you just need another field (called "AuthorName" or something like that) in the "Files" table. If you do have many authors for each file, then yeah, make an "Authors" table, with an ID field and an AuthorName field, and then make another table called "Files-Authors" with ID, File ID and Author ID fields.

All tables should have a primary key field. It should always be a unique number (Access provides the "autonumber" field type for this very purpose).

jaed - yeah but this is early stages. We don't even know what hosting he has available yet. In order to get a database schema down, Access is a quick and simple tool to work with, that we can later export the schema from if we want to use a proper database server online. I wouldn't recommend running a site off Access, but it'll work in a pinch. But that's a whole other thread.
 
jæd said:
Why are you using Access...? Mysql + Php will do the job quite ok...
Wintermute said:
Mmmm, relational database schemas :)

Right, this is (the first draft of) how I would do it. Fire up your copy of Access and make a new table. It needs fields for:
and cos i got it. wot the fuck is Mysql + Php :confused:
 
MySQL is a database server. Works like Access, but bigger, faster, and you ideally need to know a bit of SQL to get it set up right. Your hosting provider will be able to tell you if it's available on your hosting package - they might offer different database servers like PostGres or even Microsoft's SQL Server.

PHP is a scripting language for server-side stuff (like, say, talking to databases). There are several such scripting languages; again, ask your hosting provider what's available. You're going to need some server-side scripting when you get your database up online, in order to build pages that get the data out of it.
 
Wintermute said:
All tables should have a primary key field. It should always be a unique number (Access provides the "autonumber" field type for this very purpose).
not strictly true, they should have a primary key, but it doesn't have to be a number, the important thing is that it is unique for each row of data.

It could even be a combination of fields that are a unique identifier.
 
Wintermute said:
MySQL is a database server. Works like Access, but bigger, faster, and you ideally need to know a bit of SQL to get it set up right. Your hosting provider will be able to tell you if it's available on your hosting package - they might offer different database servers like PostGres or even Microsoft's SQL Server.

PHP is a scripting language for server-side stuff (like, say, talking to databases). There are several such scripting languages; again, ask your hosting provider what's available. You're going to need some server-side scripting when you get your database up online, in order to build pages that get the data out of it.
I suppose my first problem here, is that I have the files spread across several different servers. I tried to involve other people who volunteer Web space on an ad hoc basis. If they all use different PHP etc will all this make the endeavour difficult or impossible?

I have started with the three tables. I have created them, and added some data into the files table and the category table. Because I have my files on several servers, will I need another field in the files table to contain the link address, which will obviously be different because the files are located on different servers.

by the way, when Microsoft access asked me on closing the tables "Although the primary key isn't required, it is highly recommended. The table must have a primary key for you to define a relationship between this table and all the tables in the database. Do you want to create a primary key." I answered yes, I hope this was okay.

ResistanceMP3
 
most files appear like this;
Capital and the State, 1987, BINNS, Peter

a large number appear like this;
The Arabs Alexander Anne, Discussion & Summing up

a small number appear like this;
State Capitalism, Discussion, Sum up, 1999 Cliff Tony



I suppose I now need to retitle, not rename the file, each file that is titled "discussion and summing up", "discussion", and " sum up"? to be more descriptive, in line with the original title for example the discussion link for state capitalism should be titled "state capitalism discussion", and "state capitalism sum up "? yes?

ResistanceMP3
 
But will you never want your primary key to be changed, do you? Tables relating to each other have to make sure that their primary/foreign key doesn't change over time.

What is when your primary is a conbination of authour and title and author is marrying and getting another name? You update the authors name and all songs from here are gone. Foreign key didn't update, so better have a stupid number which connects both tables. When updating a name nothing is lost, since still this dull number is there, only a name changed.

Do you forgot your friends when they change their names? Machines do, so it's better for them to remember only number not names.
 
  1. Don't use access - it's not very useful on the internet (you just have to access the .mdb file directly without any DMS in between which means huge problems with concurrency)
  2. What you want is a simple database with one table, having columns for all the details of the various entires that you want to list, including a column for links to the target files. You should add a few different categorisation fields as well - one column each for region, type, topic, language, event for example (although 5 dimensions might be overdoing it.
  3. Then you need a program which will take this data out of the database and display it on a web page.
  4. The easiest way to do this is to find a free software program on the internet that somebody else has written to do a very similar job - there will be many out there.
  5. The second easiest way to do it is to use a MySQL database and PHP or something similar.
  6. Both the easiest way and the second easiest way require quite a bit of work and fairly complicated stuff to set up for a non-techie.
  7. It doesn't really matter where the files are stored as you'll only be putting links in the database.
 
Suggest you get a techie to do it for you (c'mon everyone with time on their hands, help the dude out...), creating a db for the 1st time is no easy matter.
Definitely use mySQL otherwise you'll be tying yourself into a microsoft server as well. Better than having the text on files, get it imported into a text field on your table then you can set up a fulltext index and make it searchable.
 
I accept what people are saying above Microsoft, because I started designing website with front-page, and had to change to Dreamweaver because Microsoft put so much junk in the web page code.so I will probably get a lighter database program and redo what I have done. At least I have learnt how to start off.

Somebody has suggested this;
what you have is often known as a "file gallery". There are lots of web sites that do the same thing (albeit with different content). You need some software that does this for you. The step of using proper software separates setting up and configuring it (the technical bits) from the relatively non-technical job of operating it, uploading files, categorising them etc.

From what you say, you'd need someone else to set it up for you, and then you could run it. Running it is likely to be less technical than using the windows software as you currently do - the software will do more for you.

Ideally you'd want a tool that produced an RSS feed of any new files you'd uploaded.

The first step would be to get someone to identify suitable software for you. Any volunteers?
 
Juice Terry said:
not strictly true, they should have a primary key, but it doesn't have to be a number, the important thing is that it is unique for each row of data.

It could even be a combination of fields that are a unique identifier.


True. My bad. Although it's usually easier to program against if you stick to a single data format for functionally identical fields types, and the easiest format to do that with, is usually an integer. You know where you are with integers.


gurrier said:
  1. Don't use access - it's not very useful on the internet (you just have to access the .mdb file directly without any DMS in between which means huge problems with concurrency)
  2. What you want is a simple database with one table, having columns for all the details of the various entires that you want to list, including a column for links to the target files. You should add a few different categorisation fields as well - one column each for region, type, topic, language, event for example (although 5 dimensions might be overdoing it.
  3. Then you need a program which will take this data out of the database and display it on a web page.
  4. The easiest way to do this is to find a free software program on the internet that somebody else has written to do a very similar job - there will be many out there.
  5. The second easiest way to do it is to use a MySQL database and PHP or something similar.
  6. Both the easiest way and the second easiest way require quite a bit of work and fairly complicated stuff to set up for a non-techie.
  7. It doesn't really matter where the files are stored as you'll only be putting links in the database.


1. I'm not suggesting using Access in a live environment. I'm suggesting that it's the quickest and easiest way for an unexperienced person to start creating relational database structures.
2. That's not even a relational database design, let alone a well-considered one. It's not scalable, it contains a large amount of redundant data, and any DBA in the world would look at it and sigh ruefully. Go and read about normalization.
3. Technically, it's not a program, it's an interface.
4. Possibly true, although TBH I could probably get a bespoke version of something like this running in about an hour using existing code snippets, and many 'off the shelf' CMS's offer levels of complexity that the user doesn't need, or use an approach to categorization that doesn't suit the data. It's a personal decision.



ResistanceMP3 said:
by the way, when Microsoft access asked me on closing the tables "Although the primary key isn't required, it is highly recommended. The table must have a primary key for you to define a relationship between this table and all the tables in the database. Do you want to create a primary key." I answered yes, I hope this was okay.

It IS ok, as long as Access has made the right field the primary key. If you've created one field as an Autonumber then Access should automatically choose that one. If it hasn't, you need to change it. The Primary Key field is the one with a little key symbol next to it (far left) when you look at the table in Design View. To set a field as Primary Key, right-click the grey area of the row at the far left and select "Set Primary Key" from the context menu that pops up.


ResistanceMP3 said:
I suppose I now need to retitle, not rename the file, each file that is titled "discussion and summing up", "discussion", and " sum up"? to be more descriptive, in line with the original title for example the discussion link for state capitalism should be titled "state capitalism discussion", and "state capitalism sum up "? yes?

Ok: 'Files' table, take two:

ID (autonumber, primary key)
file_URL (text, the full URL of the file wherever it may be located)
file_title (text, something descriptive. Doesn't really matter what.)
file_year (number is easiest. Can be datetime but date formats in Access is a can of worms you really don't want to be putting your fingers in)
file_author (text, yada yada)

Is that clearer?

The "file gallery" thing - yep, that's one way. It's what gurrier was mainly referring to with the "free software program on the internet that somebody else has written to do a very similar job". It's up to you: use a ready-made system that might or might not do what you want it to, that you don't know exactly how long will take to set up, that might break and leave you with no idea how to fix it, or invest some time in learning how to do it yourself. I know which I'd choose, but I'm not in your position.
 
Wintermute said:
2. That's not even a relational database design, let alone a well-considered one. It's not scalable, it contains a large amount of redundant data, and any DBA in the world would look at it and sigh ruefully. Go and read about normalization.
3. Technically, it's not a program, it's an interface.
you arse. 'not a program, it's an interface' - the proper adjective to use with such a phrase is 'bushitistically' not 'technically'.
 
Wintermute said:
True. My bad. Although it's usually easier to program against if you stick to a single data format for functionally identical fields types, and the easiest format to do that with, is usually an integer. You know where you are with integers.





1. I'm not suggesting using Access in a live environment. I'm suggesting that it's the quickest and easiest way for an unexperienced person to start creating relational database structures.
2. That's not even a relational database design, let alone a well-considered one. It's not scalable, it contains a large amount of redundant data, and any DBA in the world would look at it and sigh ruefully. Go and read about normalization.
3. Technically, it's not a program, it's an interface.
4. Possibly true, although TBH I could probably get a bespoke version of something like this running in about an hour using existing code snippets, and many 'off the shelf' CMS's offer levels of complexity that the user doesn't need, or use an approach to categorization that doesn't suit the data. It's a personal decision.





It IS ok, as long as Access has made the right field the primary key. If you've created one field as an Autonumber then Access should automatically choose that one. If it hasn't, you need to change it. The Primary Key field is the one with a little key symbol next to it (far left) when you look at the table in Design View. To set a field as Primary Key, right-click the grey area of the row at the far left and select "Set Primary Key" from the context menu that pops up.




Ok: 'Files' table, take two:

ID (autonumber, primary key)
file_URL (text, the full URL of the file wherever it may be located)
file_title (text, something descriptive. Doesn't really matter what.)
file_year (number is easiest. Can be datetime but date formats in Access is a can of worms you really don't want to be putting your fingers in)
file_author (text, yada yada)

Is that clearer?

The "file gallery" thing - yep, that's one way. It's what gurrier was mainly referring to with the "free software program on the internet that somebody else has written to do a very similar job". It's up to you: use a ready-made system that might or might not do what you want it to, that you don't know exactly how long will take to set up, that might break and leave you with no idea how to fix it, or invest some time in learning how to do it yourself. I know which I'd choose, but I'm not in your position.
I have recreated the file table as you suggest, but as soon as I tried to put the text in the URL field it tells methe text is too long to be edited. this is what I try to put in http://mp3.lpi.org.uk/resistancemp3/cybertariat-women-and-work-in-the-21st-century.mp3
 
ResistanceMP3 said:
so I will probably get a lighter database program and redo what I have done. At least I have learnt how to start off.

I'd really reccommend you ditch Access now. It might be a good learning tool but it isn't going to go anywhere. The task you've set yourself, while not complicated, is not easy. If you comfatable use PHP (or at least learning to) then you should really find someone to do this for you.
 
ResistanceMP3 said:
I have recreated the file table as you suggest, but as soon as I tried to put the text in the URL field it tells methe text is too long to be edited. this is what I try to put in http://mp3.lpi.org.uk/resistancemp3/cybertariat-women-and-work-in-the-21st-century.mp3

If you go into Design View on the table and click the URL row, then the allowable length (I think the default is 50) should be in the properties list at the bottom of the screen. Change it to 150 (or so :))

jæd said:
I'd really reccommend you ditch Access now. It might be a good learning tool but it isn't going to go anywhere. The task you've set yourself, while not complicated, is not easy. If you comfatable use PHP (or at least learning to) then you should really find someone to do this for you.

Well.. they're valid points but it's a bit of a defeatist attitude isn't it? Yes, you're going to need a database server at some point. And yes there's going to be coding involved. It's not that easy - but it's not that difficult either. There's plenty of free resources to help you learn. And you get to do it exactly the way you want it, and you get a bonus skill at the end of it :cool:
 
Wintermute said:
Well.. they're valid points but it's a bit of a defeatist attitude isn't it? Yes, you're going to need a database server at some point. And yes there's going to be coding involved. It's not that easy - but it's not that difficult either. There's plenty of free resources to help you learn. And you get to do it exactly the way you want it, and you get a bonus skill at the end of it :cool:

Just pointing out that starting off using Access is starting off on a dead end. And, knowing Access, you'll be bogged down sorting out Access problems. If you start of using Mysql or Postgres then at least you'll start off heading off in the right direction.
 
Wintermute said:
You are, that's true. And MySQL 5.0 is out :)
I'll probably move onto that, when I can get a copy. In the meantime just trying to learn some stuff.

so what is next?

resistanceMP3

PS. now the way my site is at the momentI have all these files with the same title "discussion and summing up". I might be jumping the gun, but surely I will have to change all those "discussion and summing up" titles that are exactly the same?
http://mp3.lpi.org.uk/resistancemp/m2004home.htm
The Arabs 2004, Alexander Anne, Discussion & Summing up

Cybertariat women & work in the 21st century 2004, Huws Ursula, Discussion & Summing up

Did Lenin Lead To Stalin 2004, Molyneux John, Discussion & Summing up

Iran 25 years after the revolution 2004, Jafari Peyman, Discussion & Summing up

Italy & The Birth Of Fascism 2004, Cookson Matthew, Discussion & Summing up

Joyce Ulysses & Modernism 2004, Kennedy Sinead, Discussion & Summing up

Left Wing Communism 2004, Hayes Dave, Discussion & Summing up
 
This is painful. Watching the blind lead the blind.

Resistance expresses (his?) requirements well and the various issues and opportunities both for making the website easier to manage and improving the service that it offers.

It's not possible to learn how to design a database-driven website from a single forum thread. It just isn't.

While this problem isn't rocket science, it's going to take at best, a few weeks of learning to get to the point where you really understand what's going on.

I'd agree with the poster that said it would be best to get someone who already knows how to do this to help. If possible, get them to do the job and explain as they go along. But don't expect it all to magically make sense straight away.

Choice of technology:

A dynamic website uses a database to store its content and a scripting language to generate the HTML pages as they are requested/viewed.

There are many choices for both the database and the scripting language, but the biggest choice is:

- do you want to choose a Microsoft system that means you will go to hell for all eternity? (Some believe God also kills kittens if you take this option)

- do you want to do the right thing and choose open source?

What this probably means in practice is:

Microsoft ASP.NET (scripting language) and Access or SQL Server (databases) (boo!)

or

PHP (scripting language) and MySQL (database) (huzzah!).

(I _know_ there are plenty of others; I've used most of them.)

Assuming you're going to do this yourself, you need to learn how to:

1. Design the database structure ("schema") so that it holds your data in a useful way; this is generally called database design or data modelling. This is the sort of thing that takes an afternoon to learn but a lifetime to master.

2. Write "queries" (bits of code) in SQL (structured query language) that put things into your database and get it out again.

3. Write programs in your scripting language that send the SQL queries to the database and create an HTML page from the data they get back from it.

If you can afford to buy a book, a good one will repay its price many times over.

If you're going the PHP route, try this:

http://www.sitepoint.com/books/phpmysql1/

You will find umpteen free tutorials of variable quality (but often very good) if you do a search for "PHP tutorial". This will take you to sites like Sitepoint, DevShed, OnLAMP, Zend and PHPBuilder. Plus the entire reference and some tutorials are at www.php.net, but that'll probably seem overwhelming to start with.

I hope that's enough to get you started.

Regarding better ways of categorising your files, you have three realistic options. You are entirely right that you are not a librarian and doing this is harder than it looks.

1. Each file goes in a single category (as it does currently).

2. Each file can go in multiple categories with no fixed limit to the number of categories each file can be in (but the categories are defined as a list to start with).

3. You don't have a fixed set of categories. When you add a file to the database, you type a list of keywords (as many as you like, whatever you like). The database stores these and your program provides a useful way of browsing and searching using these keywords. Commonly this is known as "tagging" and is popular on sites like www.flickr.com. Geeks call this a "folksonomy" (search for it).

These are listed in increasing order of compexity to program and design.

I strongly recommend that if you're going to do this as your first dynamic site, you start with option 1. As you learn more (particularly about the "many-to-many relationship" that Wintermute casually threw in), you can convert to option 2. Once you're really confident, consider option 3 if it suits your design and aims.

---

For those that wish to know or indeed care: (non-techies can leave the room)

1. The primary key for a junction table is (generally) a compound key of the two foreign keys. There is absolutely no need to define a separate serial ID. Doing so allows you to create duplicate many-to-many relationships. This isn't something that you want. Unless your program is smart enough to handle the possibility of redundant relationship data, things can go very wrong. Why provide the opportunity for that to happen?

2. Prefixing the column names in a table with the table name is redundant and stupid. I mean things like:

file_name
file_url
file_description

That's why we have table names. It's also why we can specify the table name explictly in a query, or use an alias, when we're doing a join.

file.name
file.url

or

f.name
f.url

etc.

3. Microsoft Access. Just say no. I know quite a bit about databases but Access baffles me entirely. Perhaps I'm more a code person than a visual person. Perhaps it just sucks.

4. If you create web apps and you haven't tried Ruby on Rails yet, you're really missing out. You may have heard the hype and been turned off by it. I can assure you it's more than justified.

www.rubyonrails.org

I'd be tempted to say that if you're starting from scratch, start with Rails. But it doesn't really make sense. It's an object-relational mapping framework, but ORM isn't perfect (in Rails, or elsewhere). So you need to know database design and SQL as well as object-oriented programming.

For those that can already design databases and write a scripting language, or even better, a proper OO language, Rails is an absolute dream to work with. It's not the final word on anything, but I've yet to come across anything that is as coherent and easy to work with on small to mid-sized projects.
 
untethered said:
Microsoft ASP.NET (scripting language) and Access or SQL Server (databases) (boo!)

or

PHP (scripting language) and MySQL (database) (huzzah!).

Or... Do you want to pay money for a complicated, non-scaleable, insecure product. Or do it cheaply and easily with a more scaleable version. Most of the reason I'm against bothering with Access is that most of the knowledge gained doesn't transfer to other dbs. Its clicky-button-mouse time compared to a "proper" and more powerful command line interface. Spend your time following the One Microsoft Way...? Or learn to create dbs and commuincate with them effectivly.

That, and the fact that I've been called over to be asked how I can make Ms Secretaries Access database into a online enterprise-scaled database for several zillion hits a second.

Access == Frontpage IMHO.

I don't want t bash ResistanceMP3 for learning. But any solution that ncludes Access hasn't been thought through...
 
ResistanceMP3 said:
I'll probably move onto that, when I can get a copy. In the meantime just trying to learn some stuff.

so what is next?

Mysql is free and downloadable. So your next step is learning how to use it.
 
jæd said:
Mysql is free and downloadable. So your next step is learning how to use it.
right just downloaded Mysql five.I have installed the program, think I chose the right options :confused: , try to start the program up and got a DOS prompt. So where can I go from here to learn in two weeks how to produce a database for my website?

ResistanceMP3.

Enter password: *******
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 5.0.15-nt

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql>
 
Back
Top Bottom