    • CommentAuthorxpizzle
    • CommentTimeNov 21st 2007
     
    Since there are not too many posts on the forum, and since I am most definitely a forum whore, I will throw another post on here. My question to you is: what project or projects are you currently working on? I am sure a lot of people cannot describe their projects in depth, but maybe you can at least explain something that is already built.

    As for my projects, I am currently building a form generator for myself. (Yes, I am quite aware that PEAR has HTML_QuickForm2, and I am sure there are other ready-made classes.) I enjoy building my own classes so that I better understand what was written, and it gives me an excuse to maybe learn something I do not know!
  1.  
    The projects that I am constantly working on include TheWeddingVendor.com and HeresOurWedding.com. I also have another project, an application that lets a site's owner communicate with his or her visitors in real time. Basically it is a chat app, written in PHP and JavaScript, with a twist. If a representative is logged into the control panel, a chat link shows up on the site. When a user clicks the link, it opens a chat window and that person is placed into a queue to wait for one of the reps. There can be many users and many representatives.

    The app is pretty JavaScript intensive. There are a lot of AJAX requests, with the data returned as JSON. I guess that isn't really AJAX then, is it? AJ? Anyway, it has been fun building it, but it is not without its problems. Of course there are problems with IE: variations between JScript and JavaScript. Damn Microsoft and their proprietary BS!!!
    • CommentAuthorxpizzle
    • CommentTimeNov 25th 2007
     
    Ha, I just picked up my first real freelance job: I need to scrape 3 different sites for a company for UFC bets. I was told they have permission to do it, but I must say I am, of course, afraid of legal issues with web scraping. Right now I am thinking about doing it in plain PHP, but since I am also learning Python I will eventually write it in Python, because I am not too familiar with how to keep a PHP script running on a server. If anyone could enlighten me, that'd be great, and maybe also suggest a way I can avoid any legal issues.
  2.  
    Why do you want to have the script constantly running? How often does your data change? You can schedule a cron job to run the script.

    You could technically have a script constantly running by doing something like this:
    while (true) {
        // do whatever you need to here
        sleep(60); // sleep() takes seconds, so this waits one minute before we run again
    }
    but I believe you will run into issues with execution timeouts. I think the default timeout is 30 seconds, but you could change that. I wouldn't suggest doing it that way, though, because it seems rather inefficient. I think scheduling a cron job would be the best option, for PHP anyway. I am not familiar with Python.

    As far as the legal issues are concerned, do you have a contract with the site owner? They are technically the ones responsible for the site and its actions. You can't sue the car manufacturer because some drunk bastard runs you over. If you question their ethics, perhaps you can contact the owners of the sites to be scraped to see if they have exposed any web services or RSS feeds that you could use rather than scraping. If they freak out, then you know there is something wrong, but that may end up costing you the job!
    • CommentAuthorxpizzle
    • CommentTimeNov 26th 2007
     
    Okay, there is no contract with the site owner as of right now, but I believe I am going to request that one be written. It is someone I do kind of know, but I would still rather have things documented on paper and be safe. They asked for a scrape once an hour. I don't know how often the data changes, but it must be more frequent than either of us is guessing if they want it scraped every hour. I will look into cron jobs, and now that you know how often they want it scraped, maybe there is another possible solution.
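
    A rough sketch of the cron approach for an hourly scrape. The script location /path/to/scrape.php and the PHP binary at /usr/bin/php are assumed paths, not anything from this thread, and the runOnce helper and lock file name are made up for illustration. The lock just keeps a slow run from overlapping with the next one:

```php
<?php
// scrape.php: run-once script meant to be started by cron rather than looping forever.
// Example crontab entry (assumed paths), firing at minute 0 of every hour:
//   0 * * * * /usr/bin/php /path/to/scrape.php

/**
 * Run $job only if no other run holds the lock file (hypothetical helper).
 * Returns true if the job ran, false if it was skipped.
 */
function runOnce(string $lockPath, callable $job): bool
{
    $lock = fopen($lockPath, 'c'); // create the file if missing, never truncate it
    if ($lock === false) {
        return false;
    }
    if (!flock($lock, LOCK_EX | LOCK_NB)) { // non-blocking: give up if already locked
        fclose($lock);
        return false;
    }
    try {
        $job();
    } finally {
        flock($lock, LOCK_UN);
        fclose($lock);
    }
    return true;
}

$ran = runOnce(sys_get_temp_dir() . '/scrape.lock', function (): void {
    // Fetch and parse the pages here.
});
echo $ran ? "scrape finished\n" : "skipped: previous run still active\n";
```

    When PHP is started from the command line like this, max_execution_time defaults to 0 (no limit), so the 30-second web timeout should not get in the way.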
    • CommentAuthormgirouard
    • CommentTimeNov 26th 2007
     

    There’s no reason why PHP and a cron job couldn’t handle scraping content from a site. The problem you may run into is a site that doesn’t have well-formed XHTML. That would make using SimpleXML and/or the DOM* classes somewhat difficult (but not impossible). If worst comes to worst, you could always use regular expressions…

    Still, make sure it’s cool with the site owner that you grab the content. Like Jason said, maybe they have a feed or web service you could leverage instead.

    Mike G.
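
    A minimal sketch of the DOM* route on markup that is not well-formed XHTML. The sample HTML, the "odds" table id, the class names, and the extractLines function are all invented for illustration; a real page will need its own XPath queries. It leans on DOMDocument::loadHTML, which uses a forgiving HTML parser, rather than on SimpleXML:

```php
<?php
// Hypothetical page fragment: unquoted attributes, an unclosed <br>,
// and missing closing tags, so it would not parse as XML.
$html = '<html><body><table id=odds>'
      . '<tr><td class=fighter>Fighter A<br></td><td class=line>-150</td></tr>'
      . '<tr><td class=fighter>Fighter B</td><td class=line>+130</td></tr>';

/**
 * Return an array of fighter name => betting line pulled from the sample table.
 */
function extractLines(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table[@id="odds"]//tr') as $tr) {
        $fighter = $xpath->query('.//td[@class="fighter"]', $tr)->item(0);
        $line = $xpath->query('.//td[@class="line"]', $tr)->item(0);
        if ($fighter !== null && $line !== null) {
            $rows[trim($fighter->textContent)] = trim($line->textContent);
        }
    }
    return $rows;
}

$lines = extractLines($html);
print_r($lines);
```

    XPath queries like these tend to survive small layout changes better than regular expressions, though either will break if the site is redesigned.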
