ScrapyFSharp


Parsing a CSS file

The CssParser module can parse and manipulate CSS style sheets.

open System
open System.IO
open ScrapyFSharp
open ScrapyFSharp.CssParser

let css1 = 
    """
#smartbanner { position:absolute; left:0; top:-82px; 
    border-bottom:1px solid #e8e8e8; width:100%; height:78px; }

#smartbanner .sb-container { margin: 0 auto; }

#smartbanner .sb-close { position:absolute; left:5px; top:5px; 
    display:block; border:2px solid #fff; width:14px; 
    height:14px; font-family:'ArialRoundedMTBold',Arial; 
    font-size:15px; line-height:15px; text-align:center; color:#fff; 
    background:#070707; text-decoration:none; text-shadow:none; 
    border-radius:14px; box-shadow:0 2px 3px rgba(0,0,0,0.4); 
    -webkit-font-smoothing:subpixel-antialiased; }

#smartbanner .sb-close:active { font-size:13px; color:#aaa; }

#smartbanner .sb-icon { position:absolute; left:30px; top:10px; display:block; 
    width:57px; height:57px; background-color:white; background-size:cover; 
    border-radius:10px; box-shadow:0 1px 3px rgba(0,0,0,0.3); }

#smartbanner.no-icon .sb-icon { display:none; }

div.block1 {
    background-color: #661133;
    color: black;
}
    """
    |> parseCss

Finding a CSS block by its selector

let sbCloseActive = css1.Block "#smartbanner .sb-close:active"

The value of sbCloseActive is:

Some {Selector = "#smartbanner .sb-close:active";
      Properties = [{Name = "color";
                     Value = "#aaa";}; {Name = "font-size";
                                        Value = "13px";}];}

Finding the color property of sbCloseActive:

let color1 = 
    match sbCloseActive with
    | Some p -> p.Property "color"
    | None -> None

The value of color1 is:

Some "#aaa"
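
The same lookup can also be written with Option.bind from the F# core library instead of an explicit match. The following is a minimal sketch only: CssProperty and CssBlock here are hypothetical stand-in records that mirror the printed shape of the value above, not the real types from ScrapyFSharp.CssParser, which may differ.

```fsharp
// Hypothetical stand-ins for the library types, for illustration only.
type CssProperty = { Name : string; Value : string }

type CssBlock =
    { Selector : string
      Properties : CssProperty list }
    // Assumed to behave like a `Property : string -> string option` lookup.
    member this.Property (name : string) =
        this.Properties
        |> List.tryFind (fun p -> p.Name = name)
        |> Option.map (fun p -> p.Value)

let block =
    Some { Selector = "#smartbanner .sb-close:active"
           Properties = [ { Name = "color"; Value = "#aaa" }
                          { Name = "font-size"; Value = "13px" } ] }

// Option.bind collapses the Some/None match into a single pipeline step.
let color = block |> Option.bind (fun b -> b.Property "color")

// A property that is not declared in the block yields None.
let missing = block |> Option.bind (fun b -> b.Property "width")
```

With the real library, the equivalent expression would presumably be sbCloseActive |> Option.bind (fun b -> b.Property "color"), assuming the member signatures are as the match above suggests.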

Rendering a very simple HTML fragment

This part is experimental and just for fun. It demonstrates that we can implement CSS inheritance and HTML rendering.
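
As a rough illustration of what CSS inheritance means here: when a property is not declared on an element, its value is taken from the nearest ancestor that declares it. The sketch below uses a hypothetical Element record and a hypothetical resolveInherited function; neither is part of ScrapyFSharp, and the library's own implementation may work differently. For simplicity it treats every property as inheritable, which real CSS does not (width, for example, is not inherited).

```fsharp
// Hypothetical model, not the ScrapyFSharp implementation.
type Element =
    { Tag : string
      Styles : Map<string, string>   // properties declared directly on the element
      Parent : Element option }

// Walk up the ancestor chain until some element declares the property.
let rec resolveInherited (name : string) (el : Element) : string option =
    match Map.tryFind name el.Styles with
    | Some value -> Some value
    | None -> el.Parent |> Option.bind (resolveInherited name)

let body = { Tag = "body"; Styles = Map.ofList [ "font-size", "12pt" ]; Parent = None }
let div  = { Tag = "div";  Styles = Map.ofList [ "width", "200px" ]; Parent = Some body }
let span = { Tag = "span"; Styles = Map.ofList [ "color", "#b12121" ]; Parent = Some div }

// font-size is not set on the span or the div, so it comes from body.
let fontSize = resolveInherited "font-size" span

// color is declared on the span itself.
let color = resolveInherited "color" span

// margin is declared nowhere in the chain.
let margin = resolveInherited "margin" span
```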

Parsing a simple HTML document with FSharp.Data:

let html =
    """<!DOCTYPE html>
<html>
<head>
    <title>Html test 1</title>
    <style type="text/css">
        body
        {
            font-family: "Times New Roman";
            font-size: 12pt;
        }
        div.main
        {
            position: absolute;
            background-color: #a0cc9d;
            width: 200px;
        }
        div.main span
        {
            color: #00FF00;
        }
        div.main span.colorized
        {
            color: #b12121;
        }
    </style>
</head>
<body>
    <h1>Big title 1</h1>
    <div class="main">
        <span id="lorem1" class="colorized">
            Lorem ipsum dolor sit amet, consectetur adipiscing elit.
            Nullam commodo fringilla mollis. 
            Aenean tempor gravida tellus quis elementum. 
            Maecenas finibus lectus id lectus consectetur, id ultricies risus molestie. 
            Nunc vulputate nibh velit, ut posuere arcu hendrerit eget. 
            Cras tincidunt nisl sit amet ultricies dignissim. 
            In consectetur nec odio sollicitudin consequat. 
            Maecenas elit dui, fringilla sed dolor in, scelerisque cursus ipsum. 
            Aliquam risus erat, sollicitudin vel ante vel, tristique venenatis nisl. 
            Morbi in commodo tortor.
        </span>
        <span>text 2</span>
    </div>
</body>
</html>""" |> FSharp.Data.HtmlDocument.Parse

Checking HTML rendering in the Chrome browser:

Image of chrome

Create a Windows form and render the div with class "main" inside it.

open System.Windows.Forms
open ScrapyFSharp.HtmlCssSelectors
open ScrapyFSharp.CssSelectorExtensions
open HtmlRasterizer

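// Create a 500x500 form and draw the selected node on every repaint.
let f = new Form()
f.Width <- 500
f.Height <- 500
f.Paint.Add (
    fun p -> 
        // drawHtmlNode returns unit, so no trailing () is needed.
        p.Graphics |> HtmlRasterizer.drawHtmlNode html "div.main"
)
f.Show()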

Checking HTML rendering in the Windows form:

Image of rendergif

We are far from a perfect result, but the rendering resembles the browser output.
